A DESIGN APPARATUS AND A METHOD FOR GENERATING AN
IMPLEMENTABLE DESCRIPTION OF A DIGITAL SYSTEM 

Field of the invention 

The present invention is situated in the
field of design of systems. More specifically, the present
invention is related to a design apparatus for digital
systems, generating implementable descriptions of said
systems.

The present invention is also related to a
method for generating implementable descriptions of said
systems.



State of the art

The current need for digital systems confronts
contemporary system designers with ever increasing design
complexities. In most applications where dedicated
processors and other digital hardware are used, demand for
new systems is rising and development time is shortening.
As an example, currently there is a high interest in
digital communication equipment for public access networks.
Examples are modems for Asymmetric Digital Subscriber Loop
(ADSL) applications, and up- and downstream Hybrid Fiber-Coax
(HFC) communication. These modems are preferably
implemented in all-digital hardware using digital signal
processing (DSP) techniques, because of the
complexity of the data processing that they require.
Besides this, these systems also need short development
cycles. This calls for a design methodology that starts at
a high level and that provides for design automation as much
as possible.

One frequently used modeling description
language is VHDL (VHSIC Hardware Description Language),
which has been accepted as an IEEE standard since 1987.
VHDL is a programming environment that produces a
description of a piece of hardware. Extensions to standard
VHDL can add features of Object Oriented
Programming Languages to VHDL. This was described in the
paper OO-VHDL (Computer, October 1995, pages 18-26).
Another frequently used modeling description language is
VERILOG.

A number of commercially available system 
environments support the design of complex DSP systems. 

MATLAB of Mathworks Inc offers the
possibility of exploration at the algorithmic level. It
uses the data-vector as the basic semantical feature.
However, the developed MATLAB description has no 
relationship to a digital hardware implementation, nor does 
MATLAB support the synthesis of digital circuits. 

SPW of Alta Group offers a toolkit for the
simulation of this kind of system. SPW is typically used
to simulate data-flow semantics. Data-flow semantics define
explicit algorithmic iteration, whereas data-vector
semantics do not. SPW relies on an extensive library and
toolkit to develop systems. Unlike MATLAB, the initial
description is a block-based description. Each block used
in the systems appears in two different formats (a
simulatable and a synthesizable version), which can result in
inconsistency.

COSSAP of Synopsys performs the same kind of 
system exploration as SPW. 

DC and BC are products of Synopsys that 
support system synthesis. These products do not provide 
sufficient algorithm exploration functions. 

Because all of these tools support only part 
of the desired functionality, contemporary digital systems 
are designed typically with a mix of these environments. 
For example, a designer might do algorithmic exploration in 
MATLAB, then do architecture definition with SPW, and 
finally map the architecture definition to an 
implementation in DC. 



Aims of the invention 

It is an aim of the present invention to
disclose a design apparatus that allows generating, from a
behavioral description of a digital system, an
implementable description for said system.

It is another aim of the present invention to
disclose a design apparatus that allows designing
digital systems starting from a data-vector or data-flow
description and generating an implementable description such as
VHDL. A further aim is to perform such design tasks within
one object oriented environment.

Another aim is to provide a means comprised 
in said design apparatus for simulating the behavior of the 
system at any level of the design stage or trajectory. 



Summary of the invention 

A first aspect of the present invention
concerns a design apparatus compiled on a computer
environment for generating from a behavioral description of
a system comprising at least one digital system part, an
implementable description for said system, said behavioral
description being represented on said computer environment
as a first set of objects with a first set of relations
therebetween, said implementable description being
represented on said computer environment as a second set of
objects with a second set of relations therebetween, said
first and second set of objects being part of a design
environment.

A behavioral description is a description
which substantiates the desired behavior of a system in a
formal way. In general, a behavioral description is not
readily implementable since it is a high-level description,
and it only describes an abstract version of the system
that can be simulated. An implementable description is a
more concrete description that is, in contrast to a
behavioral description, detailed enough to be implemented
in software to provide an approximate simulation of real-life
behavior or in hardware to provide a working
semiconductor circuit.

By design environment is meant an
environment in which algorithms can be produced and run by
interpretation or compilation.

By objects is meant a data structure which
shows all the characteristics of an object from an object
oriented programming language, such as described in "Object
Oriented Design" (G. Booch, Benjamin/Cummings Publishing,
Redwood City, Calif., 1991).

Said first and second set of objects are
preferably part of a single design environment.

Said design environment preferably comprises
an Object Oriented Programming Language (OOPL). Said OOPL
can be C++.

Said design environment is preferably an open
environment wherein new objects can be created. A closed
environment will not provide the flexibility that can be
obtained with an open environment and will limit the
possibilities of the user.

Preferably, at least part of the input
signals and output signals of said first set of objects are
at least part of the input signals and output signals of
said second set of objects. Essentially all of the input
signals and output signals of said first set of objects can
be essentially all of the input signals and output signals
of said second set of objects.

At least part of the input signals and output
signals of said behavioral description are preferably at
least part of the input signals and output signals of said
implementable description. Essentially all of the input
signals and output signals of said behavioral description
can be essentially all of the input signals and output
signals of said implementable description.

Said first set of objects has preferably
first semantics and said second set of objects has
preferably second semantics. By semantics is meant the
model of computation. Said first semantics is preferably a
data-vector model and/or a data-flow model. Said second
semantics is preferably a Finite State Machine Data Path
(FSMD) data structure, comprising a control part and a
data processing part, the data processing part being
modeled by a signal flow graph (SFG) data structure and the
control part being modeled by a FSM data structure. The
terms FSMD and SFG are used interchangeably throughout the
text.

Preferably, the impact in said implementable
description of at least a part of the objects of said
second set of objects is essentially the same as the impact
in said behavioral description of at least a part of the
objects of said first set of objects.

Preferably, the impact in said implementable
description of essentially all of the objects of said
second set of objects is essentially the same as the impact
in said behavioral description of essentially all of the
objects of said first set of objects.

By impact is meant not only the function,
but also the way the object interacts with its environment
from an external point of view. A way of rephrasing this is
that the same interface for providing input and collecting
output is present. This does not mean that the actual
implementation of the data-processing between input and
output is the same. The implementation is embodied by
objects, which can be completely different but perform the
same function. In an OOPL, the use of methods of an object
without knowing its actual implementation is referred to as
information hiding.
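
By way of illustration only, the notion of two objects with the same impact but a different implementation can be sketched in C++ as follows. The class names Scaler, BehavioralScaler and RefinedScaler and the function applyStimulus are hypothetical and are not part of the design environment described here; the truncation to 8 fractional bits is likewise an assumed refinement.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical common interface (not part of OCAPI): one sample in, one sample out.
class Scaler {
public:
    virtual ~Scaler() = default;
    virtual double process(double in) = 0;
};

// Behavioral object: ideal floating point arithmetic.
class BehavioralScaler : public Scaler {
public:
    double process(double in) override { return 0.5 * in; }
};

// Refined object: same interface, but the result is truncated to
// 8 fractional bits, as a fixed point datapath might do.
class RefinedScaler : public Scaler {
public:
    double process(double in) override {
        return std::floor(0.5 * in * 256.0) / 256.0;
    }
};

// A testbench written against the interface accepts either object.
double applyStimulus(Scaler& s, double stimulus) {
    return s.process(stimulus);
}
```

Because applyStimulus only sees the interface, a behavioral object can be replaced by its refined counterpart without changing the surrounding description, which is the property relied upon during gradual refinement.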

The design apparatus preferably further
comprises means for simulating the behavior of said system,
said means simulating the behavior of said behavioral
description, said implementable description or any
intermediate description therebetween. Said intermediate
description can be obtained after one or several refining
steps from said behavioral description.

Preferably, at least part of said second set
of objects is derived from objects belonging to said first
set of objects. This can be done by using the inheritance
functionalities provided in an OOPL. Essentially all of
said second set of objects can be derived from objects
belonging to said first set of objects.

Said implementable description can be at
least partly obtained by refining said behavioral
description. Said implementable description can be
essentially obtained by refining said behavioral
description. Preferably, said refining comprises the
refining of objects.

The design apparatus can further comprise
means to derive said first set of objects from a vector
description, preferably a MATLAB description, describing
said system as a set of operations on data vectors, means
for simulating statically or demand-driven scheduled
dataflow on said dataflow description and/or means for
clock-cycle true simulating said digital system using said
dataflow description and/or one or more of said SFG data
structures.

In a preferred embodiment, said implementable
description is an architecture description of said system,
said system advantageously further comprising means for
translating said architecture description into a
synthesizable description of said system, said
synthesizable description being directly implementable in
hardware. Said synthesizable description is preferably a
netlist of hardware building blocks. Said hardware is
preferably a semiconductor chip or an electronic circuit
comprising semiconductor chips.

A synthesizable description is a description
of the architecture of a semiconductor that can be
synthesized without further processing of the description.
An example is a VHDL description.

Said means for translating said architecture
description into a synthesizable description can be
Cathedral-3 or Synopsys DC.

A second aspect of the present invention is a
method for designing a system comprising at least one
digital part, comprising a refining step wherein a
behavioral description of said system is transformed into
an implementable description of said system, said
behavioral description being represented as a first set of
objects with a first set of relations therebetween and said
implementable description being represented as a second set
of objects with a second set of relations therebetween.

Said refining step preferably comprises
translating behavioral characteristics at least partly into
structural characteristics. Said refining step can comprise
translating behavioral characteristics completely into
structural characteristics.

Said method can further comprise a simulation
step in which the behavior of said behavioral description,
said implementable description and/or any intermediate
description therebetween is simulated.

Said refining step can comprise the addition
of new objects, permitting interaction with existing
objects, and adjustments to said existing objects allowing
said interaction.

Preferably, said refining step is performed
in an open environment and comprises expansion of existing
objects. Expansion of existing objects can include the
addition to an object of methods that create new objects.
Said object is said to be expanded with the new objects.
The use of expandable objects allows the use of meta-code
generation: creating expandable objects implies an indirect
creation of the new objects.

Said behavioral description and said
implementable description are preferably represented in a
single design environment, said single design environment
advantageously being an Object Oriented Programming
Language, preferably C++.

Preferably, said first set of objects has
first semantics and said second set of objects has second
semantics. Said first semantics is preferably a data-vector
model and/or a data-flow model. Said second semantics is
preferably an SFG data structure.

The refining step preferably comprises a
first refining step wherein said behavioral description
being a data-vector model is at least partly transformed
into a data-flow model. Advantageously, said data-flow
model is an untimed floating point data-flow model.

Said refining step preferably further
comprises a second refining step wherein said data-flow
model is at least partly transformed into an SFG model.
Said data-flow model can be completely transformed into an
SFG model.

In a preferred embodiment, said first
refining step comprises the steps of determining the input
vector lengths of input, output and intermediate signals,
determining the amount of parallelism of operations that
process input signals in the form of a vector to output
signals, determining objects, connections between
objects and signals between objects of said data-flow
model, and determining the wordlength of said signals
between objects. In the remainder of this application, the
term "actors" is also used to denote objects. Connections
between objects are denoted as "edges" and signals between
objects are denoted as "tokens". Said step of determining
the amount of parallelism can preferably comprise
determining the amount of parallelism for every data vector
and reducing the unspecified communication bandwidth of
said data-vector model to a fixed number of communication
buses in said data-flow model. Said step of determining
actors, edges and tokens of said data-flow model
preferably comprises defining one or a group of data
vectors in said first data-vector model as actors; defining
data precedences crossing actor bounds as edges, said
edges behaving like queues and transporting tokens between
actors; constructing a system schedule; and running a
simulation on a computer environment. Said second refining
step preferably comprises transforming said tokens from
floating point to fixed point. Preferably, said SFG model
is a timed fixed point SFG model.

Said second set of objects with said second 
set of relations therebetween are preferably at least 
partly derived from said first set of objects with said 
first set of relations therebetween. Objects belonging to 
said second set of objects are preferably new objects, 
identical with and/or derived by inheritance from objects 
from said first set of objects, or a combination thereof. 

Several of said SFG models can be combined 
with a finite state machine description resulting in an 
implementable description. Said implementable description 
can be transformed to synthesizable code, said 
synthesizable code preferably being VHDL code. 

Another aspect of the present invention is a 
method for simulating a system, wherein a description of a 
system is transformed into compilable C++ code. 
Preferably, said description is an SFG data
structure and said compilable C++ code is used to perform 
clock cycle true simulations. 

Several SFG data structures can be combined 
with a finite state machine description resulting in an 
implementable description, said implementable description 
being said compilable C++ code suitable for simulating said 
system as software. 

A clock-cycle true simulation of a system 
uses one or more SFG data structures. 

Said clock-cycle true simulation can be an 
expectation-based simulation, said expectation-based 
simulation comprising the steps of: annotating a token age 
to every token; annotating a queue age to every queue; 
increasing token age according to the token aging rules and 
with the travel delay for every queue that has transported 
the token; increasing queue age with the iteration time of 
the actor steering the queue, and; checking whether token 
age is never smaller than queue age throughout the 
simulation. 
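
The bookkeeping behind such an expectation-based simulation can be sketched as follows. This is a minimal illustration under stated assumptions: the names Token and AgedQueue are hypothetical, ages are modeled as plain numbers, and the token aging rules, which are not spelled out in this section, are left out, so that only the travel delay and the queue age are modeled.

```cpp
#include <cassert>
#include <deque>

// A token annotated with a token age (hypothetical representation).
struct Token {
    double value;
    double age;
};

// A queue annotated with a queue age and a travel delay.
class AgedQueue {
public:
    explicit AgedQueue(double travelDelay) : travelDelay_(travelDelay) {}

    // Transporting a token adds the travel delay of this queue to the token age.
    void put(Token t) {
        t.age += travelDelay_;
        fifo_.push_back(t);
    }

    // One iteration of the actor steering the queue adds its
    // iteration time to the queue age.
    void advance(double iterationTime) { age_ += iterationTime; }

    // Reads the oldest token and reports whether the check
    // "token age is not smaller than queue age" holds for it.
    bool get(Token& out) {
        assert(!fifo_.empty());
        out = fifo_.front();
        fifo_.pop_front();
        return out.age >= age_;
    }

    double age() const { return age_; }

private:
    std::deque<Token> fifo_;
    double travelDelay_;
    double age_ = 0.0;
};
```

A simulation would call get() for every token read and flag the run as violating the designer's expectation as soon as one call returns false.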

Another aspect of the present invention is a
hardware circuit or a software simulation of a hardware
circuit designed with the design apparatus as recited
above.

Another aspect of the present invention is a
hardware circuit or a software simulation of a hardware
circuit designed with the method as recited above.

Detailed description of the invention

The present invention will be further
explained by means of examples, which do not limit the
scope of the invention as claimed.

Short description of the drawings

In figures 1A, 1B, 1C and 1D, the overall design
methodology according to an embodiment of the invention is
described.

In figure 2, a targeted architecture of a system that is to
be designed according to the invention is described.

In figure 3, the C++ modeling levels of the target architecture
are depicted.

In figure 4, an SDF model of the PN correlator of the
target architecture of figure 2 is shown.

In figure 5, a CSDF model of the PN correlator is
described.

In figure 6, a MATLAB Dataflow model of the PN correlator
is shown.

In figure 7, the SFG modeling concepts are depicted.

In figure 8, the implied description of the max actor is
described.

In figure 9, example implementations for different
expectations are given.

In figure 10, an overview of expectation based simulation
is shown.

In figure 11, the code in OCAPI, or design environment of
the invention, for a correlator processor is given.

In figure 12, the resulting circuit for datapath and
controller is hierarchically drawn.

Figure 13 describes a DECT Base station setup.

Figure 14 shows the front-end processing of the DECT
transceiver.

In figure 15, a part of the central VLIW controller
description for the DECT transceiver ASIC is shown.

In figure 16, the use of overloading to construct the
signal flowgraph data structure is shown.

In figure 17, an example C++ code fragment and its
corresponding data structure is described.

In figure 18, a graphical and C++-textual description of
the same FSM is shown.

In figure 19, the final system architecture of the DECT
transceiver is shown.

In figure 20, a data-flow target architecture is shown.

In figure 21, the simulation of one cycle in a system with
three components is shown.

In figure 22, the implementation and simulation strategy is
depicted.

In figure 23, an end-to-end model of a QAM transmission
system is shown.

In figure 24, the system contents for the QAM transmission
system is described.



The present invention can be described as a 
design environment for performing subsequent gradual 
refinement of descriptions of digital systems within one 
and the same object oriented programming language 
environment. The lowest level is semantically equivalent to 
a behavioral description at the register transfer (RT) 
level.

A preferred embodiment of the invention
comprising the design method according to the invention is
called OCAPI. OCAPI is part of a global design methodology
concept SOC++. OCAPI includes both a design environment in
an object oriented programming language and a design
method. OCAPI differs from current systems that
support architecture definition (SPW, COSSAP) in that
a designer is guided from the MATLAB level to the
register transfer level. This way, combined semantic and
syntactic translations in the design flow are avoided.

• The designer is offered a single coding framework in an
object oriented programming language, such as C++, to
express refinements to the behavior. An open environment
is used, rather than the usual interface-and-module
approach.

• The coding framework is a container of design concepts, 
used in traditional design practice. Some example design 
concepts currently supported are simulation queues, 
finite state machines, signal flowgraphs, hybrid 
floating/fixed point data types, operation profiling and 
signal range statistics. The concepts take the form of 
object oriented programming language objects (referred to 
as object in the remainder of this text), that can be 
instantiated and related to each other. 

• With this set of objects, a gradual refinement design 
route is offered: more abstract design concepts can be 
replaced with more detailed ones in a gradual way. Also, 
design concepts are combined in an orthogonal way: 
quantization effects and clock cycles (operation/operator 
mapping) for instance are two architecture features that 
can be investigated separately. Next, the different 
design hierarchies can be freely intermixed because of 
this object-oriented approach. For instance, it is 
possible to simulate half of the description at fixed 
point level, while the other half is still in floating 
point.

• The use of a single object oriented programming language
framework in OCAPI allows fast design iteration, which is
not possible in the hybrid approach that is typical nowadays.

Compared to existing data-flow-based systems
like SPW and COSSAP, the algorithm iterations
can be freely chosen. Compared to existing hardware design
environments like DC or BC, the designer can start from a
specification level that is more abstract than the
connection of blocks.

Two concepts, scaleable parallelism and
expectation based simulation, are introduced. The designer
is given an environment to check the feasibility of what
the designer thinks can be done. In the development
process, the designer creates his library of Signal
FlowGraph (SFG) versions of abstract MATLAB operations.

Description of OCAPI, a preferred embodiment of the present
invention

OCAPI is a C++ library intended for the
design of digital systems. It provides a short path from a
system design description to implementation in hardware.
The library is suited for a variety of design tasks,
including:

• Fixed Point Simulations

• System Performance Estimation

• System Profiling

• Algorithm-to-Architecture Mapping

• System Design according to a Dataflow Paradigm

• Verification and Testbench Development

Development flow 

The flow layout

The design flow according to an embodiment of
the present invention, as shown in figure 1D, starts off
with an untimed, floating point C++ system description 101.
Since data-processing intensive applications such as all-digital
transceivers are targeted, this description uses
data-flow semantics. The system is described as a network
of communicating components.

At first, the design is refined, and in each
component, features expressing hardware implementation are
introduced, including time (clock cycles) and bittrue
rounding effects. The use of C++ allows this to be expressed in
an elegant way. Also, all refinement is done in a single
environment, which greatly speeds up the design effort.

Next, the timed, bittrue C++ description 103 
is translated into an equivalent HDL description by code 
generation. For each component, a controller description 
105 and a datapath description 107 can be generated. Also 
for each component a single HDL description can be 
generated, this description preferably jointly representing 
the control processing and data processing of the 
component. This is done because OCAPI relies on separate 
synthesis tools for both parts, each one optimized towards 
controller or else datapath synthesis tasks. Through the 
use of an appropriate object modeling hierarchy, the
generation of datapath and controller HDL can be done fully
automatically.

For datapath synthesis 109, OCAPI relies on
the Cathedral-3 datapath synthesis tools, which produce
a bitparallel hardware implementation starting from
a set of signal flowgraphs. Controller synthesis 111 on the
other hand is done by the logic synthesis of Synopsys DC.
This divide and conquer strategy towards synthesis allows
each tool to be applied at the right place.

During system simulation, the system stimuli 
113 are also translated into testbenches that allow
verification of the synthesis result of each component. After
interconnecting all synthesized components into the system 
netlist, the final implementation can also be verified 
using a generated system testbench 115. 

The system model 

The system machine model that is used is a
set of concurrent processes. Each process translates to one
component in the final system implementation.

At the system level, processes execute using
data flow simulation semantics. That is, a process is
described as an iterative behavior, where inputs are read
in at the start of an iteration, and outputs are produced
at the end. Process execution can start as soon as the
required input values are available.

Inside each process, two types of
description are possible. The first one is an untimed
description, and can be expressed using any C++ constructs
available. A firing rule is also added to allow dataflow
simulation. Untimed processes are not subject to hardware
implementation but are needed to express the overall system
behavior. A typical example is a channel model used to
simulate a digital transceiver.

The second flavor of processes is timed.
These processes operate synchronously to the system clock.
One iteration of such a process corresponds to one clock
cycle of processing. Such a process consists of two
parts: a control description and a data processing
description.

The control description is done by means of a
finite state machine, while the data description is a set
of instructions. Each instruction consists of a series of
signal assignments, and can also define process inputs and
outputs. Upon execution, the control description is
evaluated to select one or more instructions for execution.
Next, the selected instructions are executed. Each
instruction thus corresponds to one clock cycle of RT
behavior.
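
The split of a timed process into a control description and a set of instructions can be sketched as follows. This is an illustrative model only, not the OCAPI class hierarchy: the names TimedProcess, Transition and clock are hypothetical, and instructions are modeled as plain callables that perform signal assignments.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of a timed process: a finite state machine selects,
// for each clock cycle, the instructions that the datapath executes.
class TimedProcess {
public:
    using Instruction = std::function<void()>;

    struct Transition {
        std::function<bool()> condition;        // evaluated on current signal values
        std::vector<Instruction> instructions;  // executed when the transition is taken
        int nextState;
    };

    void addTransition(int state, Transition t) {
        table_[state].push_back(std::move(t));
    }

    // One call corresponds to one clock cycle of RT behavior: the control
    // description is evaluated first, then the selected instructions run.
    void clock() {
        for (const Transition& t : table_[state_]) {
            if (t.condition()) {
                for (const Instruction& instr : t.instructions) instr();
                state_ = t.nextState;
                return;
            }
        }
    }

    int state() const { return state_; }

private:
    std::map<int, std::vector<Transition>> table_;
    int state_ = 0;  // state 0 is assumed to be the initial state
};
```

Transitions of a state are tried in the order in which they were added, so a final transition with an always-true condition acts as the default branch of that state.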

For system simulation, two schedulers are
available. A dataflow scheduler is used to simulate a
system that contains only untimed blocks. This scheduler
repeatedly checks process firing rules, selecting processes
for execution as their inputs become available. When the
system also contains timed blocks, however, a cycle
scheduler is used. The cycle scheduler interleaves
execution of multi-cycle descriptions, but can
incorporate untimed blocks as well.
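
A dataflow scheduler of the kind described above can be sketched as follows. The sketch assumes that an untimed process is represented by two callables, a firing rule and an iteration body; the names Queue, Process and runDataflow are hypothetical and do not denote OCAPI classes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// An edge between processes, behaving like a queue of tokens.
using Queue = std::deque<double>;

// Hypothetical untimed process: a firing rule plus one iteration of behavior.
struct Process {
    std::function<bool()> canFire;  // true when the required inputs are available
    std::function<void()> fire;     // reads inputs, produces outputs
};

// Repeatedly checks the firing rules and executes every process whose
// inputs are available. Stops when no process can fire any more or when
// maxFirings is reached. Returns the number of firings performed.
std::size_t runDataflow(std::vector<Process>& procs, std::size_t maxFirings) {
    std::size_t firings = 0;
    bool progress = true;
    while (progress && firings < maxFirings) {
        progress = false;
        for (Process& p : procs) {
            if (firings < maxFirings && p.canFire()) {
                p.fire();
                ++firings;
                progress = true;
            }
        }
    }
    return firings;
}
```

The maxFirings bound is an addition of this sketch; it guards against systems, such as a free-running source, in which some firing rule is always satisfied.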



The standard program 



The library of OCAPI has been developed with
the g++ C++ GNU compiler. The best mode embodiment uses the
g++ 2.8.1 compiler, and has been successfully compiled and
run under the HPUX 10 (HPUX10) operating system platform.
It is also possible to use a g++ 2.7.2 compiler, allowing
for compilation and run under operating system platforms
such as HPUX-9 (HPRISC), HPUX-10 (HPUX10), SunOS (SUN4),
Solaris (SUN5) and Linux 2.0.0 (LINUX).

The layout of the 'standard' g++ OCAPI
program will be explained, including compilation and
linking of this program.

First of all, g++ is a preferred standard 
compilation environment. On Linux, this is already the case 
after installation. Other operating system vendors however 
usually have their own proprietary C++ compiler. In such 
cases, the g++ compiler should be installed on the 
operating system, and the PATH variable adapted such that 
the shell can access the compiler. 

The OCAPI library comes as a set of include 
files and a binary lib. All of these are put into one 
directory, which is called the BASE directory. 

The 'standard program' is the minimal
contents of an OCAPI program. It has the following layout.

#include "qlib.h"

int main()
{
    // your program goes here
}



The include "qlib.h" includes everything you 
need to access all classes within OCAPI. 

If this program is called "standard.cxx", then the
following makefile will transform the source code into an
executable for you:



HOSTTYPE = HPUX10

BASE = /imec/vsdm/OCAPI/release/v0.9
CC = g++

QFLAGS = -c -g -Wall -I${BASE}
LIBS = -lm

%.o: %.cxx
	$(CC) $(QFLAGS) $< -o $@

TARGET = standard

all: $(TARGET)

define lnkqlib
	$(CC) $^ -o $@ $(LIBS)
endef

OBJS = standard.o

standard: ${OBJS} $(BASE)/lib$(HOSTTYPE)qlib.a
	${lnkqlib}

clean:
	rm -f *.o $(TARGET)



This is a makefile for GNU's "make"; other "make" programs
can have a slightly different syntax, especially for the
definition of the "lnkqlib" macro. It is not the shortest
possible solution for a makefile, but it is one that works
on different platforms without making assumptions about
standard compilation rules.

The compilation flags "QFLAGS" mean the following: "-c"
selects compilation-only, "-g" turns on debugging
information, and "-Wall" is the warning flag. The
debugging flag allows you to debug your program with "gdb",
the GNU debugger.

Even if you don't like a debugger and prefer "printf()"
debugging, "gdb" can at least be of great help in case
the program core dumps. Start the program under "gdb"
(type "gdb standard" at the shell prompt), type "run" to
let "standard" crash again, and then type "bt". You now
see the call trace.

Calculation 

OCAPI processes both floating point and fixed point values. 
In contrast to the standard C++ data types like "int" and
"double", a "hybrid" data type class is used, that
simulates both fixed point and floating point behavior.

The dfix class

This class is called "dfix". The particular floating/fixed
point behavior is selected by the class constructor. The
standard format of this constructor is

dfix a;             // a floating point value
dfix a(0.5);        // a floating point value with initial value
dfix a(0.5, 10, 8); // a fixed point value with initial value,
                    // 10 bits total word-length, 8 fractional bits

A fixed point value has a maximal precision equal to the mantissa
precision of a C++ "double". On most machines, this is 53
bits.

A fixed point value can also select a representation, an
overflow behavior, and a rounding behavior. These flags
are, in this order, optional parameters to the "dfix"
constructor. They can have the following values.

• Representation flag: "dfix::tc" for two's complement
signed representation, "dfix::ns" for unsigned
representation.

• Overflow flag: "dfix::wp" for wrap-around overflow,
"dfix::st" for saturation.

• Rounding flag: "dfix::fl" for truncation (floor),
"dfix::rd" for rounding behavior.

Some examples are

dfix a(0.5, 10, 8);
// the default is two's complement, wrap-around,
// truncated quantisation

dfix a(0.5, 10, 8, dfix::tc, dfix::st, dfix::rd);
// two's complement, saturation, rounding quantisation

dfix a(0.5, 10, 8, dfix::ns);
// unsigned, wrap-around, truncated quantisation

When working with fixed point "dfix"es, it is important to
keep the following rule in mind: "quantisation occurs only
when a value is defined or assigned". This means that a
large expression with several intermediate results will
never have these intermediate values quantised. This should
be kept in mind especially when writing code for hardware
implementation, because there the intermediate results are also
stored in finite hardware and therefore will have some quantisation
behavior. There is however a "cast" operator that will
help here.
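
The rule above can be illustrated with a small standalone sketch. It does not use the OCAPI library: the "Format" structure and the "quantize" function are hypothetical stand-ins for a two's complement "dfix" format with W total bits and L fractional bits, and the numbers are chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical description of a fixed point format (not the OCAPI dfix class).
struct Format {
    int W;          // total word length in bits
    int L;          // number of fractional bits
    bool saturate;  // true: saturation, false: wrap-around
    bool round;     // true: rounding, false: truncation (floor)
};

// Quantise v to a two's complement value with W bits, L of them fractional.
double quantize(double v, const Format& f) {
    const double scale = std::ldexp(1.0, f.L);         // 2^L codes per unit
    const double span  = std::ldexp(1.0, f.W);         // 2^W distinct codes
    const double lo    = -std::ldexp(1.0, f.W - 1);    // most negative code
    const double hi    = -lo - 1.0;                    // most positive code
    double code = std::floor(f.round ? v * scale + 0.5 : v * scale);
    if (f.saturate) {
        if (code > hi) code = hi;
        if (code < lo) code = lo;
    } else {
        code -= span * std::floor((code - lo) / span); // wrap into [lo, hi]
    }
    return code / scale;
}

// (1.75 + 1.5) - 1.5 in a saturating <10,8> format, quantised only on assignment.
double quantizedOnAssignment() {
    const Format f{10, 8, true, false};
    return quantize(1.75 + 1.5 - 1.5, f);
}

// The same expression with the intermediate sum quantised as well,
// as finite hardware would do.
double quantizedAtEveryStep() {
    const Format f{10, 8, true, false};
    const double sum = quantize(1.75 + 1.5, f);        // saturates below 2.0
    return quantize(sum - 1.5, f);
}

// The two evaluation styles disagree for this input.
void example() {
    assert(quantizedOnAssignment() != quantizedAtEveryStep());
}
```

In this sketch the first function yields 1.75, while the second yields 0.49609375 because the intermediate sum 3.25 saturates at the largest <10,8> value. In OCAPI itself, the "cast" operator plays the role of the explicit intermediate quantisation.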

The dfix operators

The operators on "dfix" are shown below.

+, -, *, /
    Standard addition, subtraction (including
    unary minus), multiplication and division.
+=, -=, *=, /=
    In-place versions of the previous operators.
abs
    Absolute value.
<<, >>
    Left and right shifts.
<<=, >>=
    In-place left and right shifts.
msbpos
    Most-significant bit position.
&, |, ^, ~
    Bitwise and, or, exor, and not operators.
frac() (member call)
    Fractional part.
==, !=, <=, >=, <, >
    Relational operators: equal, different,
    smaller than or equal to, greater than or
    equal to, smaller than, greater than. These
    return an "int" instead of a "dfix".

All operators with the exception of the bitwise operators work
on the maximal fixed point precision (53 bits). The
bitwise operators have a precision of 32 bits (a C++
"long"). Also, they assume the fixed point representation
contains no fractional bits.

In addition to the arithmetic operators, several utility
methods are available for the "dfix" class.

dfix a, b;

// cast a to another type
b = cast(dfix(0, 12, 10), a);

// assign b to a, retaining the quantisation of a
a = b;

// assign b to a, including the quantisation
a.duplicate(b);

// return the integer part of b
int c = (int) b;

// retrieve the value of b as a double
double d, e;
d = b.Val();
e = Val(b);

// return quantisation characteristics of a
a.TypeW();         // returns the number of bits
a.TypeL();         // returns the number of fractional bits
a.TypeSign();      // returns dfix::tc or dfix::ns
a.TypeOverflow();  // returns dfix::wp or dfix::st
a.TypeRound();     // returns dfix::fl or dfix::rd

// check if two dfixes are identical in value and
// quantisation
identical(a, b);

// see whether a is floating or fixed point
a.TypeMode();      // returns dfix::fixpoint or dfix::floatpoint
a.isDouble();
a.isFix();

// write a to cout
cout << a;

// write a to stdout, in float format,
// on a field of 10 characters
write(cout, a, 'f', 10);

// now use a fixed-format
write(cout, a, 'g', 10);

// next assume a is a fixed point number, and write out an
// integer representation (considering the decimal point at
// the lsb of a); use a hexadecimal format
write(cout, a, 'x', 10);

// use a binary format
write(cout, a, 'b', 10);

// use a decimal format
write(cout, a, 'd', 10);

// read a from stdin
cin >> a;

Communication 

Apart from values, OCAPI is concerned with the
communication of values between blocks of behavior. The
high level method of communication in OCAPI is a FIFO
queue, of type "dfbfix". This queue is conceptually
infinite in length. In practice it is bounded by a phone call
from the system operator telling you that you have used up
all the swap space of the system.

The dfbfix class 

A queue is declared as

dfbfix a("a");

This creates a queue with name a. The queue is intended to
pass value objects of the type "dfix". There is also an
alias type of "dfbfix", known as "FB" (flow buffer). So you
can also write

FB a("a");

The dfbfix operations

The basic operations on a queue allow you to store and retrieve
"dfix" objects. The operations are

dfix k;
dfix j(0.5);
dfbfix a("a");

// insert j at the front of a
a.put(j);

// operator format for an insert
a << j;

// insert j at position 5, with position 0 corresponding to
// the front of a.
a.putIndex(j, 5);

// read one element from the back of a
k = a.get();

// operator format for a read
a >> j;

// peek one element at position 1 of a
k = a.getIndex(1);

// operator format for peek
k = a[1];

// retrieve one element from a and throw it away
a.pop();

// throw away all elements, if any, from a
a.clear();

// return the number of elements in a as an int
int n = a.getSize();

// return the name of the queue
char *p = a.name();

Whenever you perform an access operation that reads past
the end of a FIFO, a runtime error results, showing

Queue Underflow @ get in queue a

Utility calls for dfbfix

Besides the basic operations on queues, there are some
additional utility operations that modify the behavior of a queue.

// make a queue of length 20. The default length of a queue
// is 16. Whenever this length is exceeded by a put, the
// storage in the queue is dynamically expanded by a factor
// of 2.
dfbfix a("a", 20);

// After the asType() call, the queue will have an input
// "quantizer" that will quantize each element inserted
// into the queue to that of the quantizer type
dfix q(0, 10, 8);
a.asType(q);

// After an asDebug() call, the queue is associated with a
// file, that will collect every value written into the
// queue. The file is opened as the queue is initialized
// and closed when the queue object is destroyed.
a.asDebug("thisfile.dat");

// Next makes a duplicate queue of a, called b. Every write
// into a will also be done on b. Each queue is allowed to
// have at most ONE duplicate queue.
dfbfix b("b");
a.asDup(b);

// Thus, when another duplicate is needed, you write it as
dfbfix c("c");
b.asDup(c);

During the communication of "dfix" objects, the queues keep
track of some statistics on the values that are passed
through them. You can use the "<<" operator and the member
function "stattitle()" to make these statistics visible.

The next program demonstrates these statistics

#include "qlib.h"

int main()
{
  dfbfix a("a");
  a << dfix(2);
  a << dfix(1);
  a << dfix(3);
  a.stattitle(cout);
  cout << a;
  return 0;
}

When running this program, the following appears on screen

Name put get MinVal     @idx MaxVal     @idx Max# @idx
a    3   0   1.0000e+00 2    3.0000e+00 3    3    3

The first line is printed by the "stattitle()" call as a
mnemonic for the fields printed below. The next line is the
result of passing the queue to the standard output stream
object. The fields mean the following:



Name     The name of the queue
put      The total number of elements "put()" into the queue
get      The total number of elements "get()" from the queue
MinVal   The lowest element put onto the queue
@idx     The put sequence number that passed this lowest element
MaxVal   The highest element put onto the queue
@idx     The put sequence number that passed this highest element
Max#     The maximal queue length that occurred
@idx     The put sequence number that resulted in this maximal
         queue length
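
How such statistics can be gathered on every "put" is sketched below. This is not the OCAPI implementation: "StatFifo" and all of its member names are hypothetical, and the bookkeeping is only one plausible reading of the field descriptions above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical queue that records the statistics described above.
struct StatFifo {
    std::deque<double> data;
    long puts = 0, gets = 0;          // "put" and "get" columns
    double minVal = 0, maxVal = 0;    // "MinVal" and "MaxVal" columns
    long minIdx = 0, maxIdx = 0;      // put sequence numbers, counted from 1
    std::size_t maxLen = 0;           // "Max#" column
    long maxLenIdx = 0;               // put that produced the maximal length

    void put(double v) {
        ++puts;
        if (puts == 1 || v < minVal) { minVal = v; minIdx = puts; }
        if (puts == 1 || v > maxVal) { maxVal = v; maxIdx = puts; }
        data.push_back(v);
        if (data.size() > maxLen) { maxLen = data.size(); maxLenIdx = puts; }
    }

    double get() {
        assert(!data.empty());        // reading past the end is an error
        ++gets;
        double v = data.front();
        data.pop_front();
        return v;
    }
};
```

Putting the values 2, 1 and 3 into such a queue, without reading any of them, gives put = 3, get = 0, MinVal = 1 at put number 2, MaxVal = 3 at put number 3, and a maximal length of 3 reached at put number 3, which agrees with the row printed by the example program.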



Globals and derivatives for dfbfix

There are two special derivatives of "dfbfix". Both are
derived classes, so you can use them wherever you
would use a "dfbfix". Only the first will be discussed
here; the other one is related to cycle-true simulation and
is discussed in section "Faster Communications".

The "dfbfix_nil" object is like a "/dev/null" drain. Every
"dfix" written into this queue is thrown away. A read operation
from such a queue results in a runtime error.

There are two global variables related to queues. The
"listOfFB" is a pointer to a list of queues, containing
every queue object you have declared in your program. The
member function call "nextFB()" will return the successor
of the queue in the global list. For example, the code
snippet

dfbfix *r;
for (r = listOfFB; r; r = r->nextFB())
{

}

will walk through all the queues present in the OCAPI
program.

The other global variable is "nilFB", which is of the type
"dfbfix_nil". It is intended to be used as a global
trashcan.

The basic block 

OCAPI supports the dataflow simulation paradigm. In order
to define the actors to the system, one "base" class is
used, from which all actors will inherit. In order to do
untimed simulations, one should follow a standard template
to which new actor classes must conform. In this section,
the standard template is introduced and the writing
style is documented.

Basic block include and code file

Each new actor in the system is defined with one header
file and one C++ source code file. We define a standard
block, "add", which performs an addition.

The include file, "add.h", looks like

#ifndef ADD_H
#define ADD_H

#include "qlib.h"

class add : public base
{
public:
  add(char *name, FB & _in1, FB & _in2, FB & _o1);
  int run();
private:
  FB *in1;
  FB *in2;
  FB *o1;
};

#endif

This defines a class "add" that inherits from "base". The
"base" object is the one that OCAPI works with, so
you must inherit from it in order to obtain an OCAPI basic
block.

The private members in the block are pointers to
communication queues. Optionally, the private members
can also contain state, for example the tap values in a
filter. The management of state for untimed blocks is
entirely the responsibility of the user; OCAPI
does not care what you use as extra
variables.

The public members include a constructor and an execution
call "run". The constructor must at least take a name
and a list of the queues that are used for communication.
Optionally, some parameters can be passed, for instance in
the case of parametrized blocks (filters with a variable number
of taps and the like).

The contents of the adder block are described in
"add.cxx".

#include "add.h"

add::add(char *name, FB & _in1, FB & _in2, FB & _o1) :
  base(name)
{
  in1 = _in1.asSource(this);
  in2 = _in2.asSource(this);
  o1 = _o1.asSink(this);
}

int add::run()
{
  // firing rule
  if (in1->getSize() < 1)
    return 0;
  if (in2->getSize() < 1)
    return 0;

  o1->put(in1->get() + in2->get());
  return 1;
}

The constructor passes the name of the object to the "base"
class it inherits from. In addition, it initializes private
members with the other parameters. In this example, the
communication queue pointers are initialized. This is not
done through simple pointer assignment, but through the
function calls "asSource" and "asSink". This is not
obligatory, but allows OCAPI to analyze the connectivity
between the basic blocks. Since a queue is intended for
point-to-point communication, it is an error to use a queue
as input or output more than once. The function calls
"asSource" and "asSink" keep track of which blocks
source/sink which queues. They will raise a runtime error
in case a queue is sourced or sinked more than once. The
constructor can optionally also be used to perform
initialization of other private data (state for instance).
The "run()" method contains the operations to be performed
when the block is invoked. The behavior is described in an
iterative way. The "run" function must return an integer
value: 1 if the block succeeded in performing the
operation, and 0 if it failed.

This behavior consists of two parts: a firing rule and an
operative part. The firing rule must check for the
availability of data on the input queues. When insufficient
data is present (checked with the "getSize()"
member call), it stops execution and returns 0. When
sufficient data is present, execution can start. Execution
of an untimed behavior can use the different C++ control
constructs available. In this example, one value is read from
each of the two input queues, the values are added, and the result
is put into the output queue. After execution, the value 1 is returned
to signal that the behavior has completed.

Predefined standard blocks: file sources and sinks

The OCAPI library contains three predefined standard
blocks, which are a file source "src", a file sink "snk",
and a ram storage block "ram".

The file sources and sinks define operating-system
interfaces and allow you to bring file data into an OCAPI
simulation, and to write out resulting data to a file. The
examples below show various declarations of these blocks.
Data in these files is formatted as floating point numbers
separated by white space. For output, newlines are used as
whitespace.

// define a file source block, with name a, that will read
// data from the file "in.dat" and put it into the queue k
dfbfix k("k");
src a("a", k, "in.dat");

// an alternative definition is
dfbfix k("k");
src a("a", k);
a.setAttr(src::FILENAME, "in.dat");

// which also gives you a complex version
dfbfix k1("k1");
dfbfix k2("k2");
src a("a", k1, k2);
a.setAttr(src::FILENAME, "in.dat");

// define a sink block b, that will put data from queue o
// into a file "out.dat".
dfbfix o("o");
snk b("b", o, "out.dat");

// an alternative definition is
dfbfix o("o");
snk b("b", o);
b.setAttr(snk::FILENAME, "out.dat");

// which also gives you a complex version
dfbfix o1("o1");
dfbfix o2("o2");
snk b("b", o1, o2);
b.setAttr(snk::FILENAME, "out.dat");

// the snk block also has a Matlab goodie which will format
// output data into a matrix A that can be read in directly
// by Matlab.
dfbfix o("o");
snk b("b", o, "out.m");
b.setAttr(snk::FILENAME, "out.m");
b.setAttr(snk::MATLABMODE, 1);

Predefined standard blocks: RAM 

The ram untimed block is intended to simulate single-port
storage blocks at high level. By necessity, some
interconnect assumptions had to be made on this block. On
the other hand, it is supported all the way through code
generation.

OCAPI does not generate RAM cells. However, it will
generate appropriate connections in the resulting system
netlist, onto which a RAM cell can be connected.

The declaration of a ram block is as follows.

// make a ram a, with an address bus, a data input bus, a
// data output bus, a read command line, a write command
// line, with 64 locations
dfbfix address("address");
dfbfix data_in("data_in");
dfbfix data_out("data_out");
dfbfix read_c("read_c");
dfbfix write_c("write_c");

ram a("a", address, data_in, data_out, write_c, read_c, 64);

// clear the ram
a.clear();

// fill the ram with the linear sequence
// data = k1 + address * k2
a.fill(k1, k2);

// dump the contents of a to cout
a.show();

The execution semantics of the ram are as follows. For each
read or write, an address, a read command and a write
command must be presented. If the read command equals
"dfix(1)", a read will be performed, and the value stored
at the location presented through "address" will be put on
"data_out". If the read command equals any other value, a
dummy value will be presented at "data_out". If no read
command was presented, no data will be presented on
"data_out". For writes, a similar story holds
for the "data_in" input: whenever a write command is
presented, the data input will be consumed. When the write
command equals 1, the data input will be stored in the
location provided through "address". When a read and a write
command are given at the same time, the read will be
performed before the write. The ram also includes an online
"purifier" that will generate a warning message whenever
data from an unwritten location is read.
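
The access rules above can be summarised in a small standalone model. It is only a sketch under simplifying assumptions: "RamModel", "kDummy" and the counting of warnings are hypothetical names and choices, commands are plain doubles instead of queue tokens, and every access is assumed to present both a read and a write command.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Value presented on the data output when the read command is not 1
// (the actual dummy value used by OCAPI is not specified here).
constexpr double kDummy = 0.0;

// Hypothetical single-port ram model following the semantics described above.
class RamModel {
public:
    explicit RamModel(std::size_t locations)
        : mem_(locations, 0.0), written_(locations, false) {}

    // One access. dataIn is the token consumed from the data input;
    // the return value is the token produced on the data output.
    double access(std::size_t addr, double readCmd, double writeCmd, double dataIn) {
        double out = kDummy;
        if (readCmd == 1.0) {                  // the read is performed first
            if (!written_[addr]) ++warnings_;  // "purifier": unwritten location
            out = mem_[addr];
        }
        if (writeCmd == 1.0) {                 // then the write
            mem_[addr] = dataIn;
            written_[addr] = true;
        }
        return out;
    }

    int warnings() const { return warnings_; }

private:
    std::vector<double> mem_;
    std::vector<bool> written_;
    int warnings_ = 0;
};
```

With this model, a simultaneous read and write to a location that already holds 7.5 returns 7.5 and only afterwards stores the new value, which is the read-before-write ordering described above.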

25 Untimed simulations 

Given the descriptions of one or more untimed blocks, a
simulation can be done. The description of a simulation
requires the following to be included in a standard C++
"main()" procedure:

• The instantiation of one or more basic blocks.

• The instantiation of one or more communication queues
  that interconnect the blocks.

• The setup of stimuli. Either these can be included at
  runtime by means of the standard file source blocks, or
  else dedicated C++ code can be written that fills up a
  queue with stimuli.

• A schedule that drives the execution methods of the basic
  blocks.

A schedule, in general, is the specification of the
sequence in which block firing rules must be tested (and
the blocks fired if necessary) in order to run a simulation. There has
been quite some research in determining how such a schedule
can be constructed automatically from the interconnection
network and knowledge of the block behavior. Up to now, an
automatic mechanism for a general network with arbitrary
blocks has not been found. Therefore, OCAPI relies on the
designer to construct such a schedule.

Layout of an untimed simulation

In this section, the template of the standard simulation
program will be given, along with a description of the
"schedule" class that will drive the simulation. A
configuration with the "add" block (described in the
section on basic blocks) is used as an example.

#include "qlib.h"
#include "add.h"

int main()
{
  dfbfix i1("i1");
  dfbfix i2("i2");
  dfbfix o1("o1");

  src SRC1("SRC1", i1, "SRC1");
  src SRC2("SRC2", i2, "SRC2");
  add ADD("ADD", i1, i2, o1);
  snk SNK1("SNK1", o1, "SNK1");

  schedule S1("S1");
  S1.next(SRC1);
  S1.next(SRC2);
  S1.next(ADD);
  S1.next(SNK1);

  while (S1.run());

  i1.stattitle(cout);
  cout << i1;
  cout << i2;
  cout << o1;

  return 0;
}


The simulation above instantiates three communication
buffers that interconnect four basic blocks. The
instantiation defines at the same time the interconnection
network of the simulation. Three of the untimed blocks are
standard file sources and sinks, provided with OCAPI. The
"add" block is a user defined one.

After the definition of the interconnection network, a
schedule must be defined. A simulation schedule is
constructed using "schedule" objects. In the example, one
schedule object is defined, and the four blocks are
assigned to it by means of a "next()" member call.

The order in which "next()" calls are done determines the
order in which firing rules will be tested. For each
execution of the schedule object "S1", the "run()" methods
of "SRC1", "SRC2", "ADD" and "SNK1" are called, in that
order. The execution method of a schedule object is called
"run()". This function returns an integer, equal to one
when at least one block in the current iteration has
executed (i.e. the "run()" of the block has returned one).
When no block has executed, it returns zero.

The while loop in the program therefore is an execution of
the simulation. Let us assume that the directory of the
simulator executable contains the two required stimuli
files, "SRC1" and "SRC2". Their contents are as follows

SRC1   SRC2   (header, not present in the file)

1      4
2      5
3      6

When compiling and running this program, the simulator
responds:

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1

Name put get MinVal     @idx MaxVal     @idx Max# @idx
i1   3   3   1.0000e+00 1    3.0000e+00 3    1    1
i2   3   3   4.0000e+00 1    6.0000e+00 3    1    1
o1   3   3   5.0000e+00 1    9.0000e+00 3    1    1



and in addition has created a file "SNK1", containing

SNK1   (header, not present in the file)

5.000000e+00
7.000000e+00
9.000000e+00



The "INFO" messages appearing on standard output are a side
effect of creating a basic block. The table at the end is
produced by the print statements at the end of the program.



More on schedules

If you examine closely which blocks are fired in
which iteration (for instance with a debugger), then you
will find

iteration 1
  run SRC1 => i1 contains 1.0
  run SRC2 => i2 contains 4.0
  run ADD  => o1 contains 5.0
  run SNK1 => write out o1
  schedule.run() returns 1
iteration 2
  run SRC1 => i1 contains 2.0
  run SRC2 => i2 contains 5.0
  run ADD  => o1 contains 7.0
  run SNK1 => write out o1
  schedule.run() returns 1
iteration 3
  run SRC1 => i1 contains 3.0
  run SRC2 => i2 contains 6.0
  run ADD  => o1 contains 9.0
  run SNK1 => write out o1
  schedule.run() returns 1
iteration 4
  run SRC1 => at end-of-file, fails
  run SRC2 => at end-of-file, fails
  run ADD  => no input tokens, fails
  run SNK1 => no input tokens, fails
  schedule.run() returns 0 => end simulation



There are two schedule member functions, "traceOn()" and
"traceOff()", that will produce similar information for
you. If you insert

S1.traceOn();

just before the while loop, then you see

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ SRC1 SRC2 ADD SNK1 ]
S1 [ ]

Name put get MinVal     @idx MaxVal     @idx Max# @idx
i1   3   3   1.0000e+00 1    3.0000e+00 3    1    1
i2   3   3   4.0000e+00 1    6.0000e+00 3    1    1
o1   3   3   5.0000e+00 1    9.0000e+00 3    1    1

appearing on the screen. This trace feature is convenient
during schedule debugging.

In the simulation output, you can also notice that the
maximum number of tokens in the queues never exceeds one.
If you had entered another schedule sequence, for example

schedule S1("S1");
S1.next(ADD);
S1.next(SRC2);
S1.next(SRC1);
S1.next(SNK1);

then you would notice that the maximum number of tokens on
the queues would result in different figures. On the other
hand, the resulting data file, "SNK1", will contain exactly
the same results. This demonstrates one important property
of dataflow simulations: any arbitrary but consistent
schedule yields the same results. Only the required amount
of storage will change from schedule to schedule.
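
This property can be reproduced with a small standalone model of the example network. Nothing below is OCAPI code: "Fifo", "Block", "runOnce" and "simulate" are hypothetical names, the blocks are plain lambdas, and the second schedule in the usage note, which tests the sources twice per iteration, is an assumption made only to show a change in storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical queue that remembers its maximal length.
struct Fifo {
    std::deque<double> data;
    std::size_t maxLen = 0;
    void put(double v) { data.push_back(v); maxLen = std::max(maxLen, data.size()); }
    double get() { double v = data.front(); data.pop_front(); return v; }
    std::size_t getSize() const { return data.size(); }
};

using Block = std::function<int()>;  // run(): 1 when fired, 0 otherwise

// One schedule iteration: test every block in order, report if any fired.
int runOnce(const std::vector<Block>& sched) {
    int fired = 0;
    for (const Block& b : sched) fired |= b();
    return fired;
}

struct Result { std::vector<double> out; std::size_t maxLen; };

// Run the SRC1/SRC2/ADD/SNK1 network with the blocks tested in the given
// order (0 = SRC1, 1 = SRC2, 2 = ADD, 3 = SNK1).
Result simulate(const std::vector<int>& order) {
    Fifo i1, i2, o1;
    std::deque<double> s1{1, 2, 3}, s2{4, 5, 6};  // contents of the stimuli files
    std::vector<double> sink;                     // contents of the output file
    std::vector<Block> blocks = {
        [&] { if (s1.empty()) return 0; i1.put(s1.front()); s1.pop_front(); return 1; },
        [&] { if (s2.empty()) return 0; i2.put(s2.front()); s2.pop_front(); return 1; },
        [&] { if (i1.getSize() < 1 || i2.getSize() < 1) return 0;
              o1.put(i1.get() + i2.get()); return 1; },
        [&] { if (o1.getSize() < 1) return 0; sink.push_back(o1.get()); return 1; },
    };
    std::vector<Block> sched;
    for (int idx : order) sched.push_back(blocks[idx]);
    while (runOnce(sched)) {}
    return {sink, std::max({i1.maxLen, i2.maxLen, o1.maxLen})};
}
```

In this model, simulate({0, 1, 2, 3}) and simulate({2, 1, 0, 3}) both deliver 5, 7, 9 to the sink. A schedule such as {0, 1, 0, 1, 2, 3} still delivers 5, 7, 9, but the input queues now hold up to two tokens instead of one.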

In multirate systems, it is convenient to have different
schedule objects and group all blocks working on the same
rate in one schedule.



Profiling in untimed simulations

Untimed simulations are not targeted to circuit
implementation. Rather, they have an explorative character.
Besides the queue statistics, OCAPI also enables you to do
precise profiling of operations. The requirements for this
feature are that

• You use "schedule" objects to construct the simulation

• You describe block behavior with "dfix" objects

Profiling is enabled by default. To view profiling results,
you send the schedule object under consideration to the
standard output stream. In the "main" example program given
above, you can modify this as

#include "qlib.h"
#include "add.h"

int main()
{

  schedule S1("S1");

  cout << S1;
}

When running the simulation, you will see the following
appearing on stdout:

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1

Name put get MinVal     @idx MaxVal     @idx Max# @idx
i1   3   3   1.0000e+00 1    3.0000e+00 3    1    1
i2   3   3   4.0000e+00 1    6.0000e+00 3    1    1
o1   3   3   5.0000e+00 1    9.0000e+00 3    1    1

Schedule S1 ran 4 times:
  SRC1 3
  SRC2 3
  ADD  3
     + 3
  SNK1 3



For each schedule, it is reported how many times it was
run. Inside each schedule, a firing count of each block is
given. Inside each block, an operation execution count is
given. The simple "add" block gives the rather trivial
result that there were three additions done during the
simulation.

The gain of operation profiling is that it lets you estimate the
computational requirement of each block. For instance, if
you find that you need to do 23 multiplications in a block
that was fired 5 times, then you would need at least five
multipliers to guarantee that the block implementation needs
only one cycle to execute.
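
The estimate above is a ceiling division of the operation count by the firing count. A minimal sketch follows; the function name "operatorsPerFiring" is hypothetical, and the result is only a lower bound, since the profile reports totals and not how the operations are spread over the individual firings.

```cpp
#include <cassert>

// Lower bound on the number of operators needed to execute one firing in a
// single cycle, given the profiled operation count and firing count.
long operatorsPerFiring(long opCount, long firings) {
    assert(firings > 0);
    return (opCount + firings - 1) / firings;  // ceiling of opCount / firings
}
```

For the figures in the text, operatorsPerFiring(23, 5) evaluates to 5, since 23 / 5 = 4.6 and at least one firing must perform five or more multiplications.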

Finally, if you want to suppress operation profiling for
some blocks, then you can use the member function call
"noOpsCnt()" for each block. For instance, writing

ADD.noOpsCnt();

suppresses operation profiling in the ADD block.

Implementation

The features presented in the previous sections contain
everything you need to do untimed, high level simulations.
These kinds of simulations are useful for initial
development. For real implementation, more detail has to be
added to the descriptions.

OCAPI makes few assumptions on the target architecture of
your system. One is that you target bitparallel and
synchronous hardware. Synchronicity is not a basic
requirement for OCAPI. The current version however
constructs single-thread simulations, and also assumes that
all hardware runs at the same clock. If different clocks
need to be implemented, then a change to the clock-cycle
true simulation algorithm will have to be made. Also, it is
assumed that one basic block will eventually be implemented
as one processor.

One question that comes to mind is how hardware sharing
between different basic blocks can be expressed. The answer
is that you will have to construct a basic block that
merges the behaviors of the two other blocks. Some
designers might feel reluctant to do this. On the other
hand, if you have to write down merged behavior, you will
also have to think about the control problems that are
induced by this merging. OCAPI will not solve this
problem for you, though it will provide you with the means
to express it.

Before code generation will translate a description to an
HDL, one will have to take care of the following tasks:

• One will have to specify wordlengths. The target
  hardware is capable of doing bitparallel, fixed point
  operations, but not of doing floating point operations.
  One of the design tasks is to perform the quantisation
  on floating point numbers. The "dfix" class discussed
  earlier contains the mechanisms for expressing fixed
  point behavior.

• One will have to construct a clock-cycle true
  description. In constructing this description, one will
  not have to allocate actual hardware, but rather express
  which operations one expects to be performed in which
  clock cycle. The semantic model for describing this
  clock cycle true behavior consists of a finite state
  machine and a set of signal flow graphs. Each signal
  flow graph expresses one cycle of implemented behavior.
  This style of description splits the control operations
  from the data operations in your program. In contrast, the
  untimed description you have used before has a common
  representation of control and data.

OCAPI does not force an ordering on these tasks. For
instance, one might first develop a clock cycle true
description on floating point numbers, and afterwards
tackle the quantization issues. This eases verification of
the clock-cycle true circuit against the untimed high level
simulation.

The final implementation also assumes that all
communication queues will be implemented as wiring. They
will contain no storage, nor will they be subject to buffer
synthesis. In a dataflow simulation, initial buffering
values can however be necessary (for instance in the
presence of feedback loops). In OCAPI, such a buffer must
be implemented as an additional processor that incorporates
the required storage. The resulting system dataflow will
become deadlocked because of this. The cycle scheduler
however, that simulates timed descriptions, is clever
enough to look for these "initial tokens" inside of the
descriptions.

In the next sections, the classes that allow you to express
clock cycle true behavior are introduced.

Signals and signal flowgraphs 

Some initial considerations on signals are introduced
first.

Hardware versus Software

Software programs always use memory to store variables. In
contrast, hardware programs work with signals, which might
or might not be stored in a register. This feature can be
expressed in OCAPI by using the "_sig" class. Simply
speaking, a "_sig" is a "dfix" for which one has indicated
whether it needs storage or not.

In implementation, a signal with storage is mapped to a net
driven by a register, while an immediate signal is mapped
to a net driven by an operator.

Besides the storage issue, a signal also departs from the
concept of "scope" one uses in a program. For instance, in
a function one can use local variables, which are destroyed
(i.e. for which the storage is reclaimed) after one has
executed the function. In hardware however, one controls
the signal-to-net mapping by means of the clock signal.

Therefore one has to manage the scope of signals. The
signal scope is expressed by using a signal flowgraph
object, "sfg". A signal flowgraph marks a boundary on
hardware behavior, and will allow subsequent synthesis
tools to find out operator allocation, hardware sharing and
signal-to-net mapping.

The _sig class and related operations

Hardware signals can be expressed in three flavors. They can
be plain signals, constant signals, or registered signals.
The following example shows how these three can be defined.

// define a plain signal a, with a floating point dfix
// inside of it.
_sig a("a");

// define a plain signal b, with a fixed point dfix inside
// of it.
_sig b("b", dfix(0,10,8));

// define a registered signal c, with an initial value k
// and attached to a clock ck.
dfix k(0.5);
clk ck;
_sig c("c", ck, k);

// define a constant signal d, equal to the value k
_sig d(k);

The registered signals, and more in particular the clock
object, are explained in more detail when signal
flowgraphs and finite state machines are discussed. This
section concentrates on operations that are available for
signals.

Using signals and signal operations, one can construct
expressions. The signal operations are a subset of the
operations on "dfix". This is because there is a hardware
operator implementation behind each of these operations.

• +, -, *
Standard addition, subtraction (including unary minus),
multiplication

• &, |, ^, ~
Bitwise and, or, exor, and not operators

• ==, !=, <=, >=, <, >
Relational operators

• <<, >>
Left and right shifts

• s.cassign(s1, s2)
Conditional assignment with s1 or s2 depending on s

• cast(T, s)
Convert the type of s to the type expressed in "dfix" T

• lu(L, s)
Use s as an index into lookup table L and retrieve the
indexed element

• msbpos(s)
Return the position of the msb in s

Precision considerations are the same as for "dfix". That
is, precision is at most the mantissa precision of a double
(53 bits). For the bitwise operations, 32 bits are assumed
(a long). "cast", "lu" and "msbpos" are not member but
friend functions. In addition, "msbpos" expects fixed-point
signals.

_sig a("a");
_sig b("b");
_sig c("c");

// some simple operations
c = a + b;
c = a - b;
c = a * b;

// bitwise operations work only on fixed point signals
_sig e(dfix(0xff, 10, 0));
_sig d("d", dfix(0, 10, 0));
_sig f("f", dfix(0, 10, 0));
f = d & e;
f = d | e;
f = ~d;
f = d ^ _sig(dfix(3, 10, 0));

// shifting
// a dfix is automatically promoted to a constant _sig
f = d << dfix(3, 8, 0);

// conditional assignment
f = (d < dfix(2, 10, 0)).cassign(e, d);

// type conversion is done with cast
_sig g("g", dfix(0, 3, 0));
g = cast(dfix(0, 3, 0), d);

// a lookup table is an array of unsigned long
unsigned long j[] = {1, 2, 3, 4, 5};
// a lookup table with 5 elements, 3 bits wide
lookupTable j_lookup("j_lookup", 5, dfix(0, 3, 0)) = j;
// find element 2
g = lu(j_lookup, dfix(2, 3, 0));

If one is interested in simulation only, then one should 
not worry too much about type casting and the like. 
However, if one intends implementation, then some rules are 
at hand. These rules are induced by the hardware synthesis 
tools. If one fails to obey them, then one will get a 
runtime error during hardware synthesis. 

• All operators, apart from multiplication, return a 
signal with the same wordlength as the input signal. 

• Multiplication returns a wordlength that is the sum of 
the input wordlengths. 

• Addition, subtraction, bitwise operations, comparisons 
and conditional assignment require the two input 
operands to have the same wordlength. 

Some common pitfalls that result from this restriction are
the following.

• Intermediate results will, by default, not expand
wordlength. In contrast, operations on dfix do not lose
precision on intermediate results. For example, shifting
an 8 bit signal up 8 positions will return the value
of zero, on 8 bits. If you want to keep the
precision, then you must first cast the operand to the
desired output wordlength, before doing the shift.

• The multiplication operator increases the wordlength,
which is not automatically reduced when you assign the
result to a signal of smaller width. If you want to
reduce wordlength, then you must do this by using a cast
operation.
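The wordlength rules and the shift pitfall can be illustrated outside OCAPI with plain unsigned integers. The sketch below is not OCAPI code: the helper names "truncate", "shift_up" and "mul_width" are hypothetical, and the model only covers unsigned values with no fractional bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not part of OCAPI: keep only the lowest `bits` bits,
// which is what a fixed wordlength does to an unsigned value.
inline std::uint32_t truncate(std::uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    return bits == 32 ? value : (value & ((std::uint32_t{1} << bits) - 1u));
}

// A shift returns a result with the same wordlength as its input operand,
// so bits shifted past the top are lost.
inline std::uint32_t shift_up(std::uint32_t value, unsigned bits, unsigned n) {
    assert(n < 32);
    return truncate(truncate(value, bits) << n, bits);
}

// A multiplication returns the sum of the input wordlengths.
inline unsigned mul_width(unsigned bits_a, unsigned bits_b) {
    return bits_a + bits_b;
}
```

Under these assumptions, "shift_up(0xAB, 8, 8)" gives zero, while widening the operand to 16 bits first, "shift_up(0xAB, 16, 8)", keeps the value. This mirrors the advice to cast before shifting.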



For complex expressions, these type promotion rules look a
bit tedious. They are however used because they allow you
to express behavior precisely down to the bit level. For
example, the following piece of code extracts each of the
bits of a three bit signal:

_sig threebits(dfix(6,3,0));
dfix bit(0,1,0);
_sig bit2("bit2"), bit1("bit1"), bit0("bit0");

bit2 = cast(bit, threebits >> dfix(2));
bit1 = cast(bit, threebits >> dfix(1));
bit0 = cast(bit, threebits);

These bit manipulations would not be possible without the given
type promotion rules.
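For readers who want to check the arithmetic of the bit extraction, the following plain C++ sketch reproduces it without OCAPI. The function names are hypothetical; "extract_bit" stands in for the combination of a right shift and a cast to a one bit "dfix".

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for cast(bit, s >> n): shift down, then keep 1 bit.
inline unsigned extract_bit(unsigned value, unsigned position) {
    assert(position < 32);
    return (value >> position) & 1u;
}

// Returns {bit2, bit1, bit0} of a three bit value.
inline std::array<unsigned, 3> three_bits(unsigned value) {
    assert(value < 8);
    return std::array<unsigned, 3>{{extract_bit(value, 2),
                                    extract_bit(value, 1),
                                    extract_bit(value, 0)}};
}
```

For the value 6 used in the example (binary 110), this yields bit2 = 1, bit1 = 1 and bit0 = 0.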



For hardware implementation, the following operators are
present.

• Addition and subtraction are implemented on ripple-carry
adder/subtractors.

• Multiplication is implemented with a Booth multiplier
block.

• Casts are hardwired.

• Shifts are either hardwired in case of constant shifts,
or else a barrel shifter is used in case of variable
shifts.

• Comparisons are implemented with dedicated comparators
(in case of constant comparisons), or subtractions (in
case of variable comparisons).

• Bitwise operators are implemented by their direct gate
equivalent at the bit level.

• Lookup tables are implemented as PLA blocks that are
mapped using two-level or multi-level random logic.

• Conditional assignment is done using multiplexers.

• Msbit detection is done using a dedicated msbit-detector.

Globals and utility functions for signals 

There are a number of global variables that directly relate
to the "_sig" class, as well as the embedded "sig" class.
In normal circumstances, you do not need to use these
functions.

The variables "glbNumberOf_Sig" and "glbNumberOfSig"
contain the number of "_sig" and "sig" that your program
has defined. The variable "glbNumberOfReg" contains the
number of "sig" that are of the register type. This
represents the word-level register count of your design.
The "glbSigHashConflicts" contains the number of hash
conflicts that are present in the internal signal data
structure organization. If this number is more than, say, 5%
of "glbNumberOf_Sig", then you might consider knocking at
OCAPI's complaint counter. The simulation is not bad if you
exceed this bound, only it will go slower.

The variable "glbListOfSig" contains a global list of
signals in your system. You can go through it by means of

sig *run;
for (run = glbListOfSig; run; run = run->nextsig())
{
}

For each such "sig", you can access a number of utility
member functions.

• "isregister()" returns 1 when a signal is a register.

• "isconstant()" returns 1 when a signal is a constant
value.

• "isterm()" returns 1 when you have defined this signal
yourself. These are signals which are introduced through
"_sig()" class constructors. OCAPI however also adds
signals of its own.

• "getname()" returns the "char *" name you have used to
define the signal.

• "get_showname()" returns the "char *" name of the signal
that is used for code generation. This is equal to the
original name, but with a unique suffix appended to it.



The sfg class

In order to construct a timed (clocked) simulation, signals
and signal expressions must be assigned to a signal
flowgraph. A signal flowgraph (in the context of OCAPI) is
a container that collects all behavior that must be
executed during one clock cycle.

The sfg behavior contains

• A set of expressions using signals
• A set of inputs and outputs that relate signals to
output and input queues

Thus, a signal flowgraph object connects local behavior
(the signals) to the system through communication queues.
In hardware, the indication of input and output signals
also results in ports on your resulting circuit.

A signal flowgraph can be a marker of hardware scope. This
is also demonstrated by the following example.

_sig a("a");
_sig b("b");
_sig c(dfix(2));

dfbfix A("A");
dfbfix B("B");

// a signal flowgraph object is created
sfg add_two, add_three;

// from now on, every signal expression written down will
// be included in the signal flowgraph add_two
add_two.starts();
a = b + c;

// You must also give a name to add_two, for code
// generation
add_two << "add_two";

// also, inputs and outputs have to be indicated.
// you use the input and output objects ip and op for this
add_two << ip(b, B);
add_two << op(a, A);

// next expression will be part of add_three
add_three.starts();
a = b + dfix(3);

add_three << "add_three";
add_three << ip(b, B);
add_three << op(a, A);

// you can also do semantical checks on signal flowgraphs
add_two.check();
add_three.check();

The semantical check warns you about the following
specification errors:

• Your signal flowgraph contains a signal which is not
declared as a signal flowgraph input and at the same
time, it is not a constant or a register. In other
words, your signal flowgraph has a dangling input.

• You have written down a combinatorial loop in your
signal flowgraph. Each signal must be ultimately
dependent on registered signals, constants, or signal
flowgraph inputs. If any other dependency exists, you
have written down a combinatorial loop for which
hardware synthesis is not possible.

Execution of a signal flowgraph 

A signal flowgraph defines one clock cycle of behavior. The 
semantics of a signal flowgraph execution are well defined. 

• At the start of an execution, all input signals are
defined with data fetched from input queues.

• The signal flowgraph output signals are evaluated in a
demand driven way. That is, if they are defined by an
expression that has signal operands with known values,
then the output signal is evaluated. Otherwise, the
unknown values of the operands are determined first. It
is easily seen that this is a recursive process. Signals
with known values are: registered signals, constant
signals, and signals that have already been calculated
in the current execution.

• The execution ends by writing the calculated output
values to the output queues.
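The demand driven evaluation described in these three steps can be sketched in plain C++ as a recursive lookup with memoization. This is only an illustration under stated assumptions: the "Node" and "Graph" types and the functions "evaluate" and "run_add_two" are hypothetical and do not reflect how OCAPI stores a signal flowgraph internally. A combinatorial loop would make this recursion endless, which is the situation the semantical check is meant to reject beforehand.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one signal: either its value is already known
// (input, constant, register) or it is defined by an expression.
struct Node {
    bool known = false;
    double value = 0.0;
    std::vector<std::string> operands;
    std::function<double(const std::vector<double>&)> expr;
};

using Graph = std::map<std::string, Node>;

// Demand driven evaluation: operands with unknown values are determined
// first, recursively; every signal is computed at most once per execution.
inline double evaluate(Graph& g, const std::string& name) {
    auto it = g.find(name);
    assert(it != g.end());  // an undefined operand is a dangling input
    Node& n = it->second;
    if (n.known) return n.value;
    std::vector<double> args;
    for (const std::string& op : n.operands) args.push_back(evaluate(g, op));
    n.value = n.expr(args);
    n.known = true;
    return n.value;
}

// Builds the flowgraph a = b + c with constant c = 2 and input b.
inline double run_add_two(double b_token) {
    Graph g;
    g["b"].known = true; g["b"].value = b_token;  // fetched from the input queue
    g["c"].known = true; g["c"].value = 2.0;      // constant signal
    g["a"].operands = {"b", "c"};
    g["a"].expr = [](const std::vector<double>& v) { return v[0] + v[1]; };
    return evaluate(g, "a");                      // value for the output queue
}
```

Calling "run_add_two" with the tokens 1 and 2 returns 3 and 4 respectively.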



Signal flowgraph semantics are somewhat related to untimed
blocks with firing rules. A signal flowgraph needs one
token to be present on each input queue. Only, the firing
rule on a signal flowgraph is not implemented. If the token
is missing, then the simulation crashes. This is a crude
way of warning you that you are about to let your hardware
evaluate a nonsense result.

The relation with untimed block firing rules allows you to
do a timed simulation which consists partly of signal
flowgraph descriptions and partly of untimed basic blocks.
The section "Timed simulations" will treat this in more
detail.

Running a signal flowgraph by hand 

A signal flowgraph is only part of a timed description. The 
control component (an FSM) still needs to be introduced. 
There can however be situations in which you would like to 
run a signal flowgraph directly. For instance, in case you 
have no control component, or if you have not yet developed 
a control description for it. 

The "sfg" member function "run()" performs the execution of
the signal flowgraph as described above. An example is used
to demonstrate this.
#include "qlib.h"

void main()
{
    _sig a("a");
    _sig b("b");
    _sig c(dfix(2));

    dfbfix A("A");
    dfbfix B("B");

    sfg add_two;
    add_two.starts();
    a = b + c;

    add_two << "add_two";
    add_two << ip(b, B);
    add_two << op(a, A);

    add_two.check();

    B << dfix(1) << dfix(2);

    // running silently
    add_two.eval();
    cout << A.get() << "\n";

    // running with debug information
    add_two.eval(cout);
    cout << A.get() << "\n";

    add_two.eval(cout);
}

When running this simulation, the following appears on the
screen.

3.000000e+00
add_two( b 2 )
: a 4
=> a 4
4.000000e+00
add_two(Queue Underflow @ get in queue B

The first line shows the result of the first "eval()" call.
When this call is given an output stream as argument, some
additional information is printed during evaluation. For
each signal flowgraph, a list of input values is printed.
Intermediate signal values are printed after the ":" at the
beginning of the line. The output values as they are
entered in the output queues are printed after the "=>".
Finally, the last line shows what happens when "eval()" is
called when no inputs are available on the input queue "B".

For signal flowgraphs with registered signals, you must
also control the clock of these signals. An example of an
accumulator is given next.

#include "qlib.h"

void main()
{
    clk ck;

    _sig a("a", ck, dfix(0));
    _sig b("b");

    dfbfix A("A");
    dfbfix B("B");

    sfg accu;
    accu.starts();
    a = a + b;
    accu << "accu";
    accu << ip(b, B);
    accu << op(a, A);
    accu.check();

    B << dfix(1) << dfix(2) << dfix(3);
    while (B.getSize())
    {
        accu.eval(cout);
        accu.tick(ck);
    }
}

The simulation is controlled in a while loop that will
consume all input values in queue "B". After each run, the
clock attached to registered signal "a" is triggered. This
is done indirectly through the "sfg" member call "tick()",
that updates all registered signals that have been assigned
within the scope of this "sfg". Running this simulation
results in the following screen output

The registered signal "a" has two values: a present value
(shown left of "/"), and a next value (shown right of "/").
When the clock ticks, the next value is copied to the
present value. At the end of the simulation, registered
signal "a" will contain 6 as its present value. The output
queue "A" however will contain the 3, the "present value"
of "a" during the last iteration.
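The present/next behavior of a registered signal can be mimicked in a few lines of plain C++. The sketch below is not OCAPI code: "Register", "AccuResult" and "run_accumulator" are hypothetical names, and it assumes, as the paragraph above states, that the output queue receives the present value of "a" while the assignment "a = a + b" defines the next value.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical model of a registered signal: a present and a next value.
struct Register {
    double present;
    double next;
    explicit Register(double init) : present(init), next(init) {}
    void tick() { present = next; }  // clock edge: next becomes present
};

struct AccuResult {
    std::vector<double> outputs;  // what ends up in output queue A
    double final_value;           // present value of a after the last tick
};

// Mimics the accumulator loop: evaluate a = a + b, write a to the output
// queue, then tick the clock, until the input queue is empty.
inline AccuResult run_accumulator(std::deque<double> input_queue) {
    Register a(0.0);
    AccuResult result{};
    while (!input_queue.empty()) {
        double b = input_queue.front();
        input_queue.pop_front();
        a.next = a.present + b;               // assignment defines the next value
        result.outputs.push_back(a.present);  // the output sees the present value
        a.tick();
    }
    result.final_value = a.present;
    return result;
}
```

With the inputs 1, 2 and 3 this model leaves 6 in the register, and the last value written to the output is 3, in line with the description above.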

Finally, if you want to include a signal flowgraph in an
untimed simulation, you must make sure that you implement
a firing rule that guards the sfg evaluation.

An example that incorporates the accumulator into an
untimed basic block is the following.

#include "qlib.h"

class accu : public base
{
public:
    accu(char *name, dfbfix &i, dfbfix &o);
    int run();
private:
    dfbfix *ipq;
    dfbfix *opq;
    sfg _accu;
    clk ck;
};

accu::accu(char *name, dfbfix &i, dfbfix &o) : base(name)
{
    ipq = i.asSource(this);
    opq = o.asSink(this);

    _sig a("a", ck, dfix(0));
    _sig b("b");
    _accu.starts();
    a = a + b;
    _accu << "accu";
    _accu << ip(b, *ipq);
    _accu << op(a, *opq);
    _accu.check();
}

int accu::run()
{
    // firing rule: one token must be present on the input queue
    if (ipq->getSize() < 1)
        return 0;
    _accu.eval();
    _accu.tick(ck);
    return 1;
}

In this example, the signal flowgraph _accu is included
in the private members of class accu.

Globals and utility functions for signal flowgraphs 

The global variable "glbNumberOfSfg" contains the number of
"sfg" objects that you have constructed in your present
OCAPI program. Given an "sfg" object, you also have a
number of utility member function calls.

• "getname()" returns the "char *" name of the signal
flowgraph.

• "merge()" joins two signal flowgraphs.

• "getisig(int n)" returns a "sig *" that indicates which
signal corresponds to input number "n" of the signal
flowgraph. If 0 is returned, this input does not exist.

• "getiqueue(int n)" returns the queue ("dfbfix *")
assigned to input number "n" of the signal flowgraph.
If 0 is returned, then this input does not exist.

• "getosig(int n)" returns a "sig *" that indicates which
signal corresponds to output number "n" of the signal
flowgraph. If 0 is returned, this output does not
exist.

• "getoqueue(int n)" returns the queue ("dfbfix *")
assigned to output number "n" of the signal flowgraph.
If 0 is returned, then this output does not exist.

You should keep in mind that a signal flowgraph is a data
structure. The source code that you have written helps to
build this data structure. However, a signal flowgraph is
not executed by running your source code. Rather, it is
interpreted by OCAPI. You can print this data structure by
means of the "cg(ostream)" member call.

For example, if you appended

accu.cg(cout);

to the "running-an-sfg-by-hand" example, then the following
output would be produced:

sfg accu
inputs { b_2 }
outputs { a_1 }
code {
a_1 = a_1_at1 + b_2;
};



Finite state machines 



With the aid of signals and signal flowgraphs, you are able
to construct clock-cycle true data processing behavior. On
top of this data processing, a control sequencing component
can be added. Such a controller allows you to execute signal
flowgraphs conditionally. The controller is also the
anchoring point for true timed system simulation, and for
hardware code generation. A signal flowgraph embedded in an
untimed block cannot be translated to a hardware processor:
you have to describe the control component explicitly.

The ctlfsm and state classes

The controller model currently embedded in OCAPI is a
Mealy-type finite state machine. This type of FSM selects
the transition to the next state based on the internal
state and the previous output value.

In an OCAPI description, you use a "ctlfsm" object to
create such a controller. In addition, you make use of
"state" objects to model controller states. The following
example shows the use of these objects.



#include "qlib.h"

void main()
{
    sfg dummy;
    dummy << "dummy";

    // create a finite state machine
    ctlfsm f;
    // give it a name
    f << "theFSM";

    // create 2 states for it
    state rst;
    state active;
    // give them a name
    rst << "rst";
    active << "active";

    // identify rst as the initial state of
    // ctlfsm f
    f << deflt(rst);
    // identify active as a plain state of ctlfsm
    // f
    f << active;

    // create an unconditional transition from
    // rst to active
    rst << allways << active;
    // "allways" is a historical typo and will be
    // replaced by "always" in the future

    // create an unconditional transition from
    // active to active, executing the dummy sfg.
    active << allways << dummy << active;

    // show what's inside f
    cout << f;
}

There are two states in this fsm, "rst" and "active". Both
are inserted in the fsm by means of the "<<" operator. In
addition, the "rst" state is identified as the default
state of the fsm, by embedding it into the "deflt" object.
An fsm is allowed to have one default state. When the fsm
is simulated, then the state at the start of the first
clock cycle will be "rst". In the hardware implementation,
a "reset" pin will be added to the processor that is used
to initialize the fsm's state register with this state.

Two transitions are defined. A transition is written
according to the template: starting state, conditions,
actions, target state, all of this separated by the "<<"
operator. The condition "allways" is a default condition
that evaluates to true. It is used to model unconditional
transitions.

The last line of the example shows a simple operation you
can do with an fsm. By relating it to the output stream,
the following will appear on the screen when you compile
and execute the example.

digraph g
{
rst [shape=box];
rst->active;
active->active;
}

This output represents a textual format of the state
transition diagram. The format is that of the "dotty" tool,
which produces a graphical layout of your state transition
diagram.
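To make the mapping from an fsm description to this textual format concrete, the following plain C++ sketch writes the same kind of graph description. It is an illustration only: "FsmSketch" and "to_dot" are hypothetical names, and the sketch assumes that the default state is the one drawn as a box, as the output above suggests.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of an fsm description: a default state and
// transitions given as (from, to) pairs of state names.
struct FsmSketch {
    std::string default_state;
    std::vector<std::pair<std::string, std::string> > transitions;
};

// Writes the state transition diagram as a directed graph; the default
// state is drawn as a box, every transition becomes one edge.
inline std::string to_dot(const FsmSketch& fsm) {
    assert(!fsm.default_state.empty());
    std::ostringstream out;
    out << "digraph g\n{\n";
    out << fsm.default_state << " [shape=box];\n";
    for (const auto& t : fsm.transitions)
        out << t.first << "->" << t.second << ";\n";
    out << "}\n";
    return out.str();
}
```

Filling "FsmSketch" with the default state "rst" and the transitions rst to active and active to active gives the same lines as the output above.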

"dotty" is commercial software available from AT&T. 

You cannot simulate a "ctlfsm" object on its own. You must
do this indirectly through the "sysgen" object, which is
introduced in the section "Timed simulations".

The cnd class

Besides the default condition "allways", you can also use
boolean expressions of registered signals. The signals need
to be registered because we are describing a Mealy-type
fsm. You construct conditions through the "cnd" object, as
shown in the next example.

#include "qlib.h"

void main()
{
    clk ck;
    _sig a("a", ck, dfix(0));
    _sig b("b", ck, dfix(0));
    _sig a_input("a_input");
    _sig b_input("b_input");
    dfbfix A("A");
    dfbfix B("B");

    sfg some_operation;
    // some operations go here ...

    sfg readcond;
    readcond.starts();
    a = a_input;
    b = b_input;
    readcond << "readcond";
    readcond << ip(a_input, A);
    readcond << ip(b_input, B);
    readcond.check();

    // create a finite state machine
    ctlfsm f;
    f << "theFSM";

    state rst;
    state active;
    state wait;

    rst << "rst";
    active << "active";
    wait << "wait";

    f << deflt(rst);
    f << active;
    f << wait;

    rst << allways << readcond << active;
    active << _cnd(a) << readcond << some_operation
           << wait;
    wait << (_cnd(a) && _cnd(b)) << readcond
         << wait;
    wait << (!_cnd(a) || !_cnd(b)) << readcond << active;
}

A FAQ is why condition signals must be registers, and
whether they can be plain signals also. The answer is
simple: no, they can't. The fsm control object is a
stand-alone machine that must be able to "boot" every clock
cycle. During one execution cycle, it will first select the
transition to take (based on conditions), and then execute
the signal flowgraphs that are attached to this transition.
If "immediate" transition conditions had to be expressed,
then the signals should be read in before the fsm
transition is made, which is not possible: the execution of
an sfg can only be done when a transition is selected, in
other words: when the condition signals are known. Besides
this semantical consideration, the registered-condition
requirement will also prevent you from writing
combinatorial control loops at the system level.
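The order of events within one cycle (select the transition from registered conditions, then execute the attached behavior, then tick the clock) can be sketched in plain C++. The sketch is not OCAPI code. The names "CondRegs", "Transition", "step" and "example_table" are hypothetical, the table copies the rst/active/wait controller of the example above, and staying in the current state when no condition holds is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: condition registers with a present and a next value.
struct CondRegs {
    bool a = false, b = false;            // present values, seen by conditions
    bool a_next = false, b_next = false;  // values assigned during this cycle
};

struct Transition {
    std::string from;
    std::function<bool(const CondRegs&)> condition;
    std::string to;
};

// One clock cycle: first select the transition from present register
// values, then execute the attached behavior (read new inputs), then tick.
inline std::string step(const std::vector<Transition>& table,
                        const std::string& state, CondRegs& regs,
                        bool a_input, bool b_input) {
    std::string next_state = state;  // assumed: stay if nothing is enabled
    for (const Transition& t : table) {
        if (t.from == state && t.condition(regs)) { next_state = t.to; break; }
    }
    regs.a_next = a_input;           // "readcond": a = a_input; b = b_input
    regs.b_next = b_input;
    regs.a = regs.a_next;            // clock tick
    regs.b = regs.b_next;
    return next_state;
}

// The rst/active/wait controller of the example above.
inline std::vector<Transition> example_table() {
    return {
        {"rst",    [](const CondRegs&)   { return true; },         "active"},
        {"active", [](const CondRegs& r) { return r.a; },          "wait"},
        {"wait",   [](const CondRegs& r) { return r.a && r.b; },   "wait"},
        {"wait",   [](const CondRegs& r) { return !r.a || !r.b; }, "active"},
    };
}
```

In this model an input that goes low in one cycle only influences the transition selected in the following cycle, which is the one-cycle delay that registered conditions introduce.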

The first signal flowgraph "readcond" takes care of reading 
in two values "a" and "b" that are used in transition 
conditions. The sfg reads the signals "a" and "b" in 
through the intermediate signals "a_input" and "b_input" . 
This way, "a" and "b" are explicitly assigned in the signal 
flowgraph, and the semantical check "readcond. check () " will 
not complain about unassigned signals. 

The fsm below it defines three states. Besides an initial
state "rst" and an operative state "active", a wait state
"wait" is defined, that is entered when the input signal
"a" is high. This is expressed by the "_cnd(a)" transition
condition in the second fsm transition. You must use
"_cnd()" instead of "cnd()" for the same reason that
you must use "_sig()" instead of "sig()": the underscore-type
classes are empty boxes that allocate the objects that
do the real work for you. This allocation is dynamic and
independent of the C++ scope.

Once the wait state is entered, it can be left only when
the signals "a" or "b" go low. This is indicated in the
transition condition of the third fsm transition. A "&&"
operator is used to express the and condition. If the
signals "a" and "b" remain high, then the wait state is not
left. The transition condition of the last transition
expresses this. It uses the logical not "!" and logical or
"||" operators to express this.

The "readcond" signal flowgraph is executed at all
transitions. This ensures that the signals "a" and "b" are
updated every cycle. If you fail to do this, then the value
of "a" and "b" will not change, potentially creating a
deadlock.

To summarize, you can use either "allways" or a logical
expression of "_cnd()" objects to express a transition
condition. The signals used in the condition must be
registers. This results in a Mealy-type fsm description.

Utility functions for fsm objects 

A number of utility functions on the "ctlfsm" and "state"
classes are available for query purposes. This is only
minimal: the objects are intended to be manipulated by the
cycle scheduler and code generators.

sfg action;
ctlfsm f;
state s1;
state s2;

f << deflt(s1);
f << s2;

s1 << allways << s2;
s2 << allways << action << s1;

// run through all the states in f
statelist *r;
for (r = f.first; r; r = r->next)
{
}

// print the number of states in f,
// print the number of transitions in f,
// print the name of f,
// print the number of sfg's in f
cout << f.numstates() << "\n";
cout << f.numtransitions() << "\n";
cout << f.getname() << "\n";
cout << f.numactions() << "\n";

// print the name of a state
cout << s1.getname() << "\n";

The basic block for timed simulations

Using signals, signal flowgraphs, finite state machines and
states, you can construct a timed description of a block.
Having obtained such a description, it is convenient to
merge it with the untimed description. This way, you will
have one class that allows both timed and untimed
simulation. Of course, this merging is a matter of writing
style, and nothing forces you to actually have both a timed
and untimed description for a block.

The basic block example, that was introduced in the section
"The basic block", will now be extended with a timed
version. As before, both an include file and a code file
will be defined. The include file, "add.h", looks like the
following code.

#ifndef ADD_H
#define ADD_H

#include "qlib.h"

class add : public base
{
public:
    add(char *name, FB &_in1, FB &_in2, FB &_o1);

    // untimed
    int run();

    // timed
    void define();
    ctlfsm &fsm() { return _fsm; }

private:
    FB *in1;
    FB *in2;
    FB *o1;
    ctlfsm _fsm;
    sfg _add;
    state _go;
};

#endif

The private members now also contain a control fsm object,
in addition to signal flowgraph objects and states. If you
feel this is becoming too verbose, you will find help in
the section "Faster description using macros", that defines
a macro set that significantly accelerates description
entry.

In the public members, two additional member functions are
declared: the "define()" function, which will set up the
timed description data structure, and the "fsm()" function,
which returns a pointer to the fsm controller. Through this
pointer, OCAPI accesses everything it needs to do
simulations and code generation.

The contents of the adder block will be described in
"add.cxx".

#include "add.h"

add::add(char *name, FB &_in1, FB &_in2, FB &_o1) :
    base(name)
{
    in1 = _in1.asSource(this);
    in2 = _in2.asSource(this);
    o1 = _o1.asSink(this);
    define();
}

int add::run()
{
    // untimed behavior, as in the section "The basic block"
}

void add::define()
{
    _sig i1("i1");
    _sig i2("i2");
    _sig ot("ot");

    _add << "add";
    _add.starts();
    ot = i1 + i2;
    _add << ip(i1, *in1);
    _add << ip(i2, *in2);
    _add << op(ot, *o1);

    _fsm << "fsm";
    _go << "go";
    _fsm << deflt(_go);
    _go << allways << _add << _go;
}

If the timed description also uses registers, then a
pointer to the global clock must also be provided (OCAPI
generates single-clock, synchronous hardware). The easiest
way is to extend the constructor of "add" with an
additional parameter "clk &ck", that will also be passed to
the "define" function.

Timed simulations 

By obtaining timed descriptions for your untimed basic
blocks, you are now ready to proceed to a timed simulation.
A timed simulation differs from an untimed one in that it
proceeds clock cycle by clock cycle. Concurrent behavior
between different basic blocks is simulated on a
cycle-by-cycle basis. In contrast, in an untimed simulation, this
concurrency is present on an iteration-by-iteration basis.

The sysgen class 

The "sysgen" object is for timed simulations the equivalent 
of a "scheduler" object for untimed simulations. In 
addition, it also takes care of code and testbench 
generation, which explains the name. 

The sysgen class is used at the system level. The timed 
"add" class, defined in the previous section, is used as an 
example to construct a system which uses untimed file 
sources and sinks, and a timed "add" class. 

#include "qlib.h"
#include "add.h"

void main()
{
    dfbfix i1("i1");
    dfbfix i2("i2");
    dfbfix o1("o1");

    src SRC1("SRC1", i1, "SRC1");
    src SRC2("SRC2", i2, "SRC2");
    add ADD("ADD", i1, i2, o1);
    snk SNK1("SNK1", o1, "SNK1");

    sysgen S1("S1");
    S1 << SRC1;
    S1 << SRC2;
    S1 << ADD.fsm();
    S1 << SNK1;

    S1.setinfo(verbose);

    clk ck;
    int i;
    for (i = 0; i < 3; i++)
    {
        S1.run(ck);
    }
}

The simulation is set up as before with queue objects and
basic blocks. Next, a "sysgen" object is created, with name
"S1". All basic blocks in the simulation are appended to
the "sysgen" object by means of the "<<" operator. If a
timed basic block is to be used, as for instance in case of
the "add" object, then the "fsm()" pointer must be
presented to "sysgen" rather than the basic block itself. A
"sysgen" object knows how to run and combine both timed and
untimed objects. For the description shown above, untimed
versions of the file sources and sink "src" and "snk" will
be used, while the timed version of the "add" object will
be used.

Next, three clock cycles of the system are run. This is
done by means of the "run(ck)" member function call of
"sysgen". The clock object "ck" is, because this simulation
contains no registered signals, a dummy object. When
running the simulator executable with stimuli file contents

SRC1  SRC2    (header, not present in the file)
1     4
2     5
3     6

you see the following appearing on the screen.

*** INFO: Defining block SRC1
*** INFO: Defining block SRC2
*** INFO: Defining block ADD
*** INFO: Defining block SNK1
fsm fsm: transition from go to go
add#0
add#1
in i1 1
in i2 4
sig ot 5
out 1 ot 5
fsm fsm: transition from go to go
add#0
add#1
in i1 2
in i2 5
sig ot 7
out 1 ot 7
fsm fsm: transition from go to go
add#0
add#1
in i1 3
in i2 6
sig ot 9
out 1 ot 9

The debugging output produced is enabled by the "setinfo()"
call on the "sysgen" object. The parameter "verbose"
enables full debugging information. For each clock cycle,
each fsm reports which transition it takes. The fsm of the
"add" block is called "fsm", and as is seen it makes
transitions from the single state "go" to the obvious
destination. Each signal flowgraph during this simulation
is executed in two phases (below it is indicated why).
During simulation, the value of each signal is printed.

Selecting the simulation verbosity 

The "setinfo" member function call of "sysgen" selects the 
30 amount of debugging information that is produced during 
simulation. Four values are available: 



• "silent" will cause no output at all. This can
significantly speed up your simulation, especially for
large systems containing several hundred signal
flowgraphs.

• "terse" will only print the transitions that the fsm's make.

• "verbose" will print detailed information on all signal
updates.

• "regcontents" will print a list of the values of registered
signals that change during the current simulation cycle. This
is by far the most interesting option if you are
debugging at the system level: when nothing happens, for
instance when all your timed descriptions are in some
"hold" mode, then no output is produced. When there is a
lot of activity, then you will be able to track all
registered signals that change.
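The idea behind "regcontents" can be sketched in a few lines of plain C++. This is not OCAPI code: the names "RegFile" and "report_changes" are hypothetical, and registers are modelled as named integers. The sketch only reports registers whose value differs between two snapshots, so an idle cycle produces no lines at all.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical model: a register file maps a signal name to its value.
using RegFile = std::map<std::string, long>;

// Return one line "name old new" for every register that changed.
// Registers that keep their value produce no output.
std::string report_changes(const RegFile& before, const RegFile& after) {
    std::ostringstream os;
    for (const auto& entry : after) {
        auto it = before.find(entry.first);
        assert(it != before.end() && "both snapshots must hold the same registers");
        if (it->second != entry.second)
            os << entry.first << " " << it->second << " " << entry.second << "\n";
    }
    return os.str();
}
```

A caller would take one snapshot before and one after each simulated clock cycle and print the returned text under a cycle header.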



The example below is part of a simulation containing 484
registered signals and 483 signal flowgraphs. Using
"setinfo(verbose)" here might require a good text editor to
see what is happening, if anything happens at all before your
quota is exceeded.



For instance, the code fragment 

    sysgen S("S");

    S.setinfo(regcontents);

    int cycle;
    for (cycle=0; cycle < 100; cycle++)
    {
        cout << "> Cycle " << cycle << "\n";
        S.run(ck);
    }

can produce an output as shown below.

    > Cycle 18
    > Cycle 19
    > Cycle 20
    > Cycle 21

    coef_ram_ir_2        0     1
    copy_step_flag       1     0
    ext_ready_out        1     0
    pc                   15    16
    step_flag            1     0
    coef_ram_ir_2        1     0
    coef_wr_adr          12    13
    hold_pc              0     16
    pc                   16    17
                         1     0
    step_clock           0     1
    copy_step_flag       0     1
    prev_step_clock      0     1
    step_flag            0     1


Two phases are better

Although you will be saved from the details behind two- 
phase simulation, it is worthwhile to see the motivation 
behind it. 

When you run an "sfg" "by hand" using the "run()" method of
an "sfg", the simulation proceeds in one phase: read
inputs, calculate, produce output. The "sysgen" object, on
the other hand, uses a two-phase simulation mechanism.

The origin is the following. In the presence of feedback
loops, your system data flow simulation will need initial
values on the communication queues in order to start the
simulation. However, the code generator assumes the
communication queues will translate to wiring. Therefore,
there will never be storage in the implementation of a
communication queue to hold these initial values. OCAPI
works around this by producing these initial values at
runtime. This gives rise to a two-phase simulation: in
the first phase, initial values are produced, while in the
second phase, they are consumed again. This process repeats
every clock cycle.

The two-phase simulation mechanism is also able to detect
combinatorial loops at the system level. If such a loop
exists, then the first phase of the simulation will
not produce any initial value on the system interconnect.
Consequently, in the last phase there will be at least one
signal flowgraph that will not be able to complete
execution in the current clock cycle. In that case, OCAPI
will stop the simulation. You also get a list of all
signal flowgraphs that have not completed the current clock
cycle, in addition to the queue statistics that are
attached to these signal flowgraphs.
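The mechanism described above can be illustrated with a small, self-contained model. It is only a sketch under simplifying assumptions, not the OCAPI implementation: "Block" and "run_cycle" are hypothetical names, wires carry no data (only an "available" flag), and a block counts as registered when its output is driven from a register.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one block (signal flowgraph) in the system.
struct Block {
    std::string name;
    std::vector<int> inputs;  // indices of the wires this block reads
    int output;               // index of the wire this block drives
    bool registered;          // true: the output comes from a register
};

// Simulate one clock cycle; return the names of blocks that could not fire.
std::vector<std::string> run_cycle(const std::vector<Block>& blocks, int num_wires) {
    assert(num_wires >= 0);
    std::vector<bool> has_value(static_cast<std::size_t>(num_wires), false);
    std::vector<bool> fired(blocks.size(), false);

    // Phase 1: register-driven outputs do not depend on this cycle's
    // inputs, so their values are produced up front.
    for (const Block& b : blocks)
        if (b.registered) has_value[b.output] = true;

    // Phase 2: fire every block whose inputs are all available, and
    // repeat until no further block can make progress.
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (fired[i]) continue;
            bool ready = true;
            for (int w : blocks[i].inputs) ready = ready && has_value[w];
            if (!ready) continue;
            fired[i] = true;
            has_value[blocks[i].output] = true;
            progress = true;
        }
    }

    // Whatever is left is on, or depends on, a loop without a register.
    std::vector<std::string> stalled;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        if (!fired[i]) stalled.push_back(blocks[i].name);
    return stalled;
}
```

With a feedback loop of two blocks, the cycle completes when one of them is registered, and both blocks are reported as stalled when neither is.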

Hardware code generation

OCAPI allows you to translate all timed descriptions to a
synthesizable hardware description.

• For each timed description, you get a datapath ".dsfg"
file that can be entered into the Cathedral-3 datapath
synthesis environment, converted to VHDL and
postprocessed by Synopsys-dc logic synthesis.

• For each timed description, you also get a controller
".dsfg" file, which is synthesized through the same
environment.

• You also get a glue cell that interconnects the
resulting datapath and controller VHDL files.

• You get a system interconnect file that integrates all
glue cells in your system. For this system interconnect
file, you can optionally specify system inputs and
outputs, scan chain interconnects, and RAM
interconnects. The file is VHDL.

• Finally, you also get debug information files that
summarize the behavior and the ports of each processor.



Untimed blocks are not translated to hardware. The use of
the actual synthesis environments will not be discussed in
this section. It is assumed to be known by a person skilled
in the art.



The generate() call

The member call "generate()" performs the code generation
for you. In the adder example, you just have to add

    S1.generate();

at the end of the main function. If you compile this
description and run it, you will see that things are not
quite OK:



    *** INFO: Generating Systen Link Cell
    *** INFO: Component generation for S1
    *** INFO: C++ currently defines 5 sig, 4 _sig, 1 sfg.
    *** INFO: Generating FSMD fsm
    *** INFO: FSMD fsm defines 1 instructions
    DSFGgen: signal i1 has no wordlength spec.
    DSFGgen: signal i2 has no wordlength spec.
    DSFGgen: signal ot has no wordlength spec.
    DSFGgen: not all signals were quantized. Aborting.
    *** INFO: Auto-cleanup of sfg

Indeed, in the adder example up to now, nothing has been
entered regarding wordlengths. During code generation,
OCAPI does a fair amount of consistency checking. The general
advice is: if you see an error or warning message,
investigate it. When you synthesize code that showed a
warning or error during generation, the synthesis process
will most likely fail too.



The "add" description is now extended with wordlengths. 8-bit
wordlengths are chosen. You modify the "add" class to
include the following changes.

    void add::define()
    {
        dfix wl(0,8,0);
        _sig i1("i1", wl);
        _sig i2("i2", wl);
        _sig ot("ot", wl);
        ...
    }

After recompiling and rerunning the OCAPI program, you now
see:

    *** INFO: Generating Systen Link Cell
    *** INFO: Component generation for S1
    *** INFO: C++ currently defines 5 sig, 4 _sig, 1 sfg.
    *** INFO: Generating FSMD fsm
    *** INFO: FSMD fsm defines 1 instructions
    *** INFO: C++ currently defines 31 sig, 21 _sig, 3 sfg.
    *** INFO: Auto-cleanup of sfg

In the directory where you ran this, you will find the
following files:

• "fsm_dp.dsfg", the datapath description of "add"

• "fsm_fsm.dsfg", the controller description of "add"

• "fsm.vhd", the glue cell description of "add"

• "S1.vhd", the system interconnect cell

• "fsm.ports", a list of the I/O ports of "add".

The glue cell "fsm.vhd" has the following contents (only
the entity declaration part is shown).

    -- Cath3 Processor for FSMD design fsm

    library IEEE;
    use IEEE.std_logic_1164.all;

    entity fsm is
        port (
            reset: in std_logic;
            clk: in std_logic;
            i1: in std_logic_vector ( 7 downto 0 );
            i2: in std_logic_vector ( 7 downto 0 );
            ot: out std_logic_vector ( 7 downto 0 )
        );
    end fsm;



Each processor has a reset pin, a clock pin, and a number
of I/O ports, depending on the inputs and outputs defined in
the signal flowgraphs contained in this processor. All
signals are mapped to "std_logic" or "std_logic_vector".
The reset pin is used for synchronous reset of the embedded
finite state machine. If you need to initialize registered
signals in the datapath, then you have to describe this
explicitly in a signal flowgraph, and execute this upon the
first transition out of the initial state.



The "fsm.ports" file indicates which ports are read in
each transition. In the example of the "add" class, there
is only one transition, which results in the following
".ports" file

    ********** SFG fsmgogo0 **********
    Port #    I/O    Port    Q
    1         I      i1      i1
    2         I      i2      i2
    1         O      ot      o1

The name of an input or output signal is used as a port
name, while the name of the queue associated with it relates
to the system net name that will be connected to this port.

System cell refinements

The system link cell incorporates all glue cells of your
current timed system description. These glue cells are
connected if they read from or write to the same system queue.
Some refinements are possible on the "sysgen" object
that also allow you to indicate system level inputs
and outputs, scan chains, and RAM connections.

System inputs and outputs are indicated with the "inpad()"
and "outpad()" member calls of "sysgen". In the example,
this is specified as

    sysgen S1("S1");

    dfix b8(0,8,0);

    S1.inpad(i1, b8);
    S1.inpad(i2, b8);
    S1.outpad(o1, b8);

Making these connections will make the "i1", "i2", "o1"
signals appear in the entity declaration of the system cell
"S1". The entity declaration inside of the file "S1.vhd"
thus looks like



    entity S1 is
        port (
            reset: in std_logic;
            clk: in std_logic;
            i1: in std_logic_vector ( 7 downto 0 );
            i2: in std_logic_vector ( 7 downto 0 );
            o1: out std_logic_vector ( 7 downto 0 )
        );
    end S1;



Scan chains can be added at the system level, too. For each
scan chain you must indicate which processors it should
include. Suppose you have three basic blocks (including a
timed description and registers) with names "BLOCK1",
"BLOCK2", "BLOCK3". You attach the blocks to two scan
chains using the following code.

    scanchain SCAN1("scan1");
    scanchain SCAN2("scan2");

    SCAN1.addscan(&BLOCK1.fsm());
    SCAN1.addscan(&BLOCK2.fsm());
    SCAN2.addscan(&BLOCK3.fsm());

The "sysgen" object identifies the required scan chain
connections through the "fsm" objects that are assigned to
it. In order to have reasonable circuit test times, you
should not include more than 300 flip-flops in each scan
chain. If you have a processor that contains more than 300
flip-flops, then you should use another scan chain
connection strategy.
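One way to respect the 300 flip-flop guideline when deciding which processor goes into which chain is a first-fit assignment. The sketch below is an illustration only and is not part of OCAPI: "first_fit_chains" is a hypothetical helper, and the flip-flop count per processor is assumed to be known to the designer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assign each processor (given by its flip-flop count) to the first scan
// chain that still has room. Returns the chain index chosen per processor.
std::vector<int> first_fit_chains(const std::vector<int>& flipflops, int limit = 300) {
    std::vector<int> load;      // flip-flops already placed in each chain
    std::vector<int> chain_of;  // result: chain index for each processor
    for (int ff : flipflops) {
        // Larger processors need a different connection strategy.
        assert(ff >= 0 && ff <= limit);
        std::size_t c = 0;
        while (c < load.size() && load[c] + ff > limit) ++c;
        if (c == load.size()) load.push_back(0);  // open a new chain
        load[c] += ff;
        chain_of.push_back(static_cast<int>(c));
    }
    return chain_of;
}
```

The resulting indices would then be turned into "addscan()" calls by hand, one per processor.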



Finally, you can generate code for the standard untimed
block RAM. There are two possible interconnection
mechanisms: the first will include the untimed RAM blocks
in "sysgen" as internal components of the system link cell.
The second will include the RAM blocks as external
components. This latter method requires you to construct a
new "system-system link cell" that includes the RAM
entities and the system link cell in a larger structure.
However, it might be required in case you have to remap the
standard RAM interface, or introduce additional
asynchronous timing logic.

An example of the two methods is shown next.

    ram RAM1("ram1", addr1, di1, do1, wr, rd, 128);
    ram RAM2("ram2", addr2, di2, do2, wr, rd, 128);

    // types of address and data bus
    dfix addrtype(0, 7, 0);
    dfix dattype(0, 4, 0);

    sysgen S1("S1");

    // define an external ram
    S1.extern_ram(RAM1, addrtype, dattype);

    // define an internal ram
    S1.intern_ram(RAM2, addrtype, dattype);

Pitfalls for code generation

As always, there are a number of pitfalls when things get
complex. You should watch out for the following when diving
into code generation.

OCAPI generates nicely formatted code that you can
inspect. To help you in this process, the actual
signal names that you have specified are reused in the
VHDL and DSFG code. This implies that you have to stay away
from VHDL and DSFG keywords as signal names, or else you will
get an error from either Cathedral-3 or Synopsys.

The mapping of the fixed point library to hardware is, in
the present release, minimal. First of all, although
registered signals allow you to specify an initial value,
you cannot rely on this for the hardware circuit.
Registers, when powered on, take on a random state.
Therefore, make sure that you specify the initialization
sequence of your datapath. A second fixed point pitfall is
that hardware support for the different quantization
schemes is lacking. It is assumed that you will eventually use
truncated quantization on the lsb side and wrap-around
quantization on the msb side of all signals. The other
quantization schemes require additional hardware. If you
really need, for instance, saturated msb quantization, then
you will have to describe it in terms of the default
quantization.

Finally, the current set of hardware operators in
Cathedral-3 is designed for signed representations. They
also work with unsigned representations as long as you do
not use relational operations (<, > and the like). If you do
need such operations, you should implement the unsigned
operation as a signed one with one extra bit.
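The reason for the extra bit can be checked with ordinary C++ integers. The sketch below is an illustration, not OCAPI or Cathedral-3 code; the function names are hypothetical, "int16_t" stands in for a 9-bit signed signal, and the 8-bit reinterpretation assumes the usual two's complement behavior.

```cpp
#include <cassert>
#include <cstdint>

// Compare two 8-bit patterns with a signed 8-bit comparator.
// Patterns with the top bit set are read as negative numbers.
bool less_signed8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::int8_t>(a) < static_cast<std::int8_t>(b);
}

// Same comparison after zero-extending both operands into a signed type
// with at least one extra bit, so every unsigned value stays non-negative.
bool less_unsigned_via_signed9(std::uint8_t a, std::uint8_t b) {
    std::int16_t wa = a;  // zero extension: value stays in 0..255
    std::int16_t wb = b;
    assert(wa >= 0 && wb >= 0);
    return wa < wb;
}
```

For the unsigned values 200 and 100, the 8-bit signed comparator reads 200 as a negative number and reports it as the smaller one, while the widened comparison gives the intended unsigned ordering.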
Verification and testbenches

Once you have obtained a gate level implementation of your
circuit, it is necessary to verify the synthesis result.
OCAPI helps you with this by generating testbenches and
testbench stimuli for you while you run timed simulations
and do code generation.

The example of the "add" class introduced previously is
picked up again, and testbench generation capability is
added to the OCAPI description.



Generation of testbench vectors

The next example performs a three cycle simulation of the
"add" class and generates testbench vectors for it.



    #include "qlib.h"

    void main()
    {
        dfbfix i1("i1");
        dfbfix i2("i2");
        dfbfix o1("o1");

        src SRC1("SRC1", i1, "SRC1");
        src SRC2("SRC2", i2, "SRC2");
        add ADD("ADD", i1, i2, o1);
        snk SNK1("SNK1", o1, "SNK1");

        sysgen S1("S1");
        S1 << SRC1;
        S1 << SRC2;
        S1 << ADD.fsm();
        S1 << SNK1;
        ADD.fsm().tb_enable();

        clk ck;
        int i;
        for (i=0; i<3; i++)
            S1.run(ck);

        ADD.fsm().tb_data();
    }

Just before the timed simulation starts, you enable the
generation of testbench vectors by means of a "tb_enable()"
member call for each fsm that requires testbench vectors.
During simulation, the values on the input and output ports
of the "add" processor are recorded. After the simulation
is done, the testbenches are generated using a "tb_data()"
member function call.

Testbench generation leaves three data files behind:

• "fsm_tb.dat" contains binary vectors of all inputs of
the "add" processor. It is intended to be read in by the
VHDL simulator as stimuli.
• "fsm_tb.dat_hex" contains hexadecimal vectors of all
inputs and outputs of the "add" processor. It contains
the output that should be produced by the VHDL simulator
when the synthesis was successful.
• "fsm_tb.dat_info" documents the contents of the stimuli
files by saying which stimuli vector corresponds to
which signal.
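The two vector formats described above can be sketched with a few helper functions. This is an illustration in plain C++, not the OCAPI generator: the helper names are hypothetical, values are assumed to fit in 32 bits, and upper-case hexadecimal digits are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Render the low `bits` bits of v as a binary string, MSB first.
std::string to_bin(unsigned v, int bits) {
    assert(bits > 0 && bits <= 32);
    std::string s;
    for (int i = bits - 1; i >= 0; --i) s += ((v >> i) & 1u) ? '1' : '0';
    return s;
}

// Render the low `bits` bits of v as hex, padded to whole digits.
std::string to_hex(unsigned v, int bits) {
    assert(bits > 0 && bits <= 32);
    static const char digits[] = "0123456789ABCDEF";
    if (bits < 32) v &= (1u << bits) - 1u;  // drop bits above the wordlength
    std::string s;
    for (int d = (bits + 3) / 4 - 1; d >= 0; --d) s += digits[(v >> (4 * d)) & 0xFu];
    return s;
}

// Format one testbench vector: all values on one line, space separated.
std::string join_line(const std::vector<unsigned>& values, int bits, bool hex) {
    std::string line;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i) line += ' ';
        line += hex ? to_hex(values[i], bits) : to_bin(values[i], bits);
    }
    return line;
}
```

For the first simulated cycle of the adder (inputs 1 and 4, output 5) with 8-bit wordlengths, the binary stimuli line holds the two inputs and the hexadecimal line holds inputs and output.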

When compiling and running this OCAPI program, the
following appears on screen.

    *** INFO: Defining block SRC1
    *** INFO: Defining block SRC2
    *** INFO: Defining block ADD
    *** INFO: Defining block SNK1
    *** INFO: Creating stimuli monitor for testbench of FSMD fsm
    *** INFO: Generating stimuli data file for testbench fsm_tb.
    *** INFO: Testbench fsm_tb has 3 vectors.

Afterwards, you can take a look at each of the three
generated testbench files.

    file: fsm_tb.dat

    00000001 00000100
    00000010 00000101
    00000011 00000110

    file: fsm_tb.dat_hex

    01 04 05
    02 05 07
    03 06 09

    file: fsm_tb.dat_info

    Stimuli for fsm_tb contains 3 vectors for
        i1_stim    read
        i2_stim    read
    Next columns occur only in _hex.dat file and are outputs
        o1_stim    write

You can now use the vectors in the simulator. But first,
you must also generate a testbench driver in VHDL.

Generation of testbench drivers

To generate a testbench driver, simply call the
"tb_enable()" member function of the "add" fsm before you
initiate code generation. You will end up with a VHDL file
"fsm_tb.vhd" that contains the following driver.

    -- Test Bench for FSMD design fsm

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_textio.all;
    use std.textio.all;

    library clock;
    use clock.clock.all;

    entity fsm_tb is
    end fsm_tb;

    architecture rtl of fsm_tb is
        signal reset: std_logic;
        signal clk: std_logic;
        signal i1: std_logic_vector ( 7 downto 0 );
        signal i2: std_logic_vector ( 7 downto 0 );
        signal ot: std_logic_vector ( 7 downto 0 );
        component fsm
            port (
                reset: in std_logic;
                clk: in std_logic;
                i1: in std_logic_vector ( 7 downto 0 );
                i2: in std_logic_vector ( 7 downto 0 );
                ot: out std_logic_vector ( 7 downto 0 )
            );
        end component;
    begin
        crystal (clk, 50 ns);
        fsm_dut : fsm
            port map (
                reset => reset,
                clk => clk,
                i1 => i1,
                i2 => i2,
                ot => ot
            );

        ini : process
        begin
            reset <= '1';
            wait until clk'event and clk = '1';
            reset <= '0';
            wait;

        end process;

        input : process
            file stimuli : text is in "fsm_tb.dat";
            variable aline : line;
            file stimulo : text is out "fsm_tb.sim_out";
            variable oline : line;
            variable v_i1: std_logic_vector ( 7 downto 0 );
            variable v_i2: std_logic_vector ( 7 downto 0 );
            variable v_ot: std_logic_vector ( 7 downto 0 );
            variable v_i1_hx: std_logic_vector ( 7 downto 0 );
            variable v_i2_hx: std_logic_vector ( 7 downto 0 );
            variable v_ot_hx: std_logic_vector ( 7 downto 0 );
        begin
            wait until reset'event and reset = '0';
            loop
                if (not (endfile (stimuli))) then
                    readline (stimuli, aline);
                    read (aline, v_i1);
                    read (aline, v_i2);
                else
                    assert false
                        report "End of input file reached"
                        severity warning;
                end if;
                i1 <= v_i1;
                i2 <= v_i2;
                wait for 50 ns;
                v_ot := ot;
                v_i1_hx := v_i1;
                v_i2_hx := v_i2;
                v_ot_hx := v_ot;
                hwrite (oline, v_i1_hx);
                write (oline, ' ');
                hwrite (oline, v_i2_hx);
                write (oline, ' ');
                hwrite (oline, v_ot_hx);
                write (oline, ' ');
                writeline (stimulo, oline);
                wait until clk'event and clk = '1';
            end loop;
        end process;
    end rtl;

    configuration tbc_rtl of fsm_tb is
        for rtl
            for all : fsm
                use entity work.fsm(structure);
            end for;
        end for;
    end tbc_rtl;

The testbench uses one additional library, "clock", which
contains the "crystal" component. This component is a
simple clock generator that drives a 50% duty cycle clk.

This testbench will generate a file "fsm_tb.sim_out". After
running the testbench in VHDL, this file should be exactly
the same as "fsm_tb.dat_hex". You can use the unix
"diff" command to check this. The only possible differences
can occur in the first few simulation cycles, if the VHDL
simulator initializes the registers to "X".

Using automatic testbench generation greatly speeds up the
verification process. You should consider using it whenever
you do code generation.

Compiled code simulations

For large designs, simulation speed can become prohibitive.
The restricting factor in OCAPI is that the signal
flowgraph data structures are interpreted at runtime. In
addition, runtime quantization (fixed point simulation)
takes up quite some CPU power.

OCAPI allows you to generate a dedicated C++ simulator
that runs compiled code instead of interpreted code. Also,
additional optimizations are done on the fixed point
simulation. The result is a simulator that runs one to two
orders of magnitude faster than the interpreted OCAPI
simulation. This speed increase comes on top of the order of
magnitude that interpreted OCAPI already gains over
event-driven VHDL simulation.

As an example, a 75 Kgate design was found to run at 55
cycles per second (on an HP/9000). This corresponds to 4.1
million gates per second (75,000 x 55 = 4,125,000), and
motivates why C++ is the way to go for system synthesis.

Generating a compiled code simulator

The compiled code generator is integrated into the "sysgen"
object. There is one member function, "compiled()", that
will generate this simulator for you.

    #include "qlib.h"
    #include "add.h"

    void main()
    {
        dfbfix i1("i1");
        dfbfix i2("i2");
        dfbfix o1("o1");
        add ADD("ADD", i1, i2, o1);

        sysgen S1("S1");
        S1 << ADD.fsm();
        S1.compiled();
    }

In this simple example, a compiled code generator is made
for a design containing only one FSM. The generator also allows
you to include several fsm blocks, in addition to untimed
blocks.

When this program is compiled and run, it leaves behind a
file "S1_ccs.cxx" that contains the dedicated simulator.
For the OCAPI user, the simulator defines one procedure,
"one_cycle()", that simulates one cycle of the system.

When this procedure is called, it also produces debugging
output similar to the "setinfo(regcontents)" call for
"ctlfsm" objects. This procedure must be linked to a main
program that will execute the simulation.

If an untimed block is present in the system, then it will
be included in the dedicated simulator. In order to declare
it, you must provide a member function "CCSdecl(ofstream
&)" that generates the required C++ declaration. As an
example, the basic RAM block declares itself as follows:

    -- file: ram.h

    class ram : public base
    {
    public:
        ram(char *name,
            FB& _address,
            FB& _data_in,
            FB& _data_out,
            FB& _w,
            FB& _r,
            int _size);
        void CCSdecl(ofstream &os);
    private:
    };

    -- file: ram.cxx

    void ram::CCSdecl(ofstream &os)
    {
        os << " #include \"ram.h\"\n";
        os << " ram " << typeName() << "(";
        os << "\"" << typeName() << "\", ";
        os << address.name() << ", ";
        os << data_in.name() << ", ";
        os << data_out.name() << ", ";
        os << w.name() << ", ";
        os << r.name() << ", ";
        os << size << ");\n";
    }

This code enables the ram to reproduce the declaration by
which it was originally constructed in the interpreted
OCAPI program. Every untimed block that inherits from
"base", and that you wish to include in the compiled code
simulator, must provide a similar "CCSdecl" function.
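The principle of a block that prints its own declaration can be tried out without OCAPI. The following sketch is a stand-alone illustration: "toy_ram" and "emit_decl" are hypothetical names, and the connected queues are represented by their names only.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an untimed block. It remembers the arguments
// it was constructed with, so it can print its own declaration later.
class toy_ram {
public:
    toy_ram(const std::string& name, const std::string& addr,
            const std::string& din, const std::string& dout, int size)
        : name_(name), addr_(addr), din_(din), dout_(dout), size_(size) {
        assert(size > 0);
    }

    // Write C++ source text that would construct an equivalent object.
    void emit_decl(std::ostream& os) const {
        os << "toy_ram " << name_ << "(\"" << name_ << "\", "
           << "\"" << addr_ << "\", \"" << din_ << "\", \"" << dout_ << "\", "
           << size_ << ");\n";
    }

private:
    std::string name_, addr_, din_, dout_;
    int size_;
};
```

Writing into a generic "std::ostream" rather than a file stream is a choice made here so that the result can be inspected in memory through a string stream.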

Compiling and running a compiled code simulator 

The compiled code simulator is compiled and linked in the 
same way as a normal OCAPI program. You must however also 
provide a "main" function that drives this simulator. 

The following code contains an example driver for the "add" 
compiled code simulator. 

    #include "qlib.h"

    void one_cycle();
    extern FB i1;
    extern FB i2;
    extern FB o1;

    void main()
    {
        i1 << dfix(1) << dfix(2) << dfix(3);
        i2 << dfix(4) << dfix(5) << dfix(6);

        one_cycle();
        one_cycle();
        one_cycle();

        while (o1.getSize())
            cout << o1.get() << "\n";
    }

When run, this program will produce the same results as
before. In contrast to the compiled simulation of your MPEG-4
image processor, you will not notice any speed
increase on this small example.

Faster communications

OCAPI uses queues as a means to communicate during 
simulation. These queues however take up CPU power for 
queue management. To save this power, there is an 
additional queue type, "wireFB" , which is used for the 
simulation of point-to-point wiring connections. 

The dfbfix_wire class 

A "wireFB" does not move data. Instead, it is tied
to a registered driver signal. At any time, the value read
from this queue is the value defined by the registered
signal. Because of this signal requirement, a "wireFB"
cannot be used for untimed simulations. The following
example of an accumulator shows how you can use a "wireFB",
or the equivalent "dfbfix_wire".

    #include "qlib.h"

    void main()
    {
        clk ck;
        _sig a("a", ck, dfix(0));
        _sig b("b");

        dfbfix_wire A("A", a);
        dfbfix B("B");

        sfg accu;
        accu.starts();
        a = a + b;
        accu << "accu";
        accu << ip(b, B);
        accu << op(a, A);
        accu.check();

        B << dfix(1) << dfix(2) << dfix(3);
        while (B.getSize())
        {
            accu.eval(cout);
            accu.tick(ck);
        }
    }

A "wireFB" is used in the same way as a normal "FB". The only
difference is that, for each "wireFB", you indicate a
registered driver signal in the constructor.
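The difference between a queue and a wire can be made concrete with two toy classes. This is a sketch for illustration only, not the OCAPI classes: "toy_queue" and "toy_wire" are hypothetical, and the registered driver signal is modelled as a plain integer variable.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Ordinary queue: tokens are stored, and each get() removes one of them.
class toy_queue {
public:
    void put(int v) { q_.push_back(v); }
    int get() {
        assert(!q_.empty());
        int v = q_.front();
        q_.pop_front();
        return v;
    }
    std::size_t size() const { return q_.size(); }
private:
    std::deque<int> q_;
};

// Wire: no storage of its own. Reading returns whatever the driving
// register currently holds, and reading does not consume anything.
class toy_wire {
public:
    explicit toy_wire(const int& driver) : driver_(driver) {}
    int get() const { return driver_; }
private:
    const int& driver_;
};
```

Because the wire has no storage to manage, a read is a single memory access, which is where the saving in simulation time comes from.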



Interconnect strategies

The "wireFB" object is related to the interconnect strategy 
that you use in your system. An interconnect strategy 
includes a decision on bus-switching, bus-storage, and bus- 
arbitration. OCAPI does not solve this problem for you: it 
depends on your application what the right interconnection 
strategy is. 

One default style of interconnection provided by OCAPI is
the point-to-point, register driven bus scheme. This means
that every bus carries only one signal from one processor
to another. In addition, bus storage is included in the
processor that drives the bus.

More complex interconnect strategies, like the one used in
Cathedral-2, are also possible, but will have to be
described in OCAPI explicitly. Thus, the freedom of target
architecture is not without cost. In the section "Meta-code
generation", a solution to this specification problem is
presented.

Meta-code generation 

OCAPI internally uses meta-code generation. This means
that there are code generators that generate new
"fsm", "sfg" and "sig" objects, which in turn can be
translated to synthesizable code.

Meta-code generation is a powerful method to increase the
abstraction level at which a specification can be made.
This way, it is also possible to make parametrized
descriptions, possibly using conditions. Therefore, it is
the key method behind soft-chip components, which are software
programs that translate themselves to a wide range of
implementations, depending on the user requirements.

The meta-code generation mechanism is also available to the 
user. To demonstrate this, a class will be presented that 
generates an ASIP datapath decoder. 

An ASIP datapath idiom 

An ASIP datapath, when described as a timed description 
within OCAPI, will consist of a number of signal flowgraphs 
and a finite state machine. The signal flowgraphs express 
the different functions to be executed by the datapath. The 
fsm description is a degenerated one, that will use one 
transition per decoded instruction. The transition 
condition is expressed by the "instruction" input, and 
selects the appropriate signal flowgraph for execution. 

Because the finite state machine has a fixed, but
parametrizable structure, it is a good candidate for meta-code
generation. You can construct a "decoder" object, that 
generates the "fsm" for you. This will allow compact 
specification of the instruction set. 

First, the "decoder" object (which is present in OCAPI) 
itself is presented. 

    -- the include file

    #define MAXINS 100
    #include "qlib.h"

    class decoder : public base
    {
    public:
        decoder(char *_name, clk &ck, dfbfix &_insq);
        void dec(int _numinstr);
        ctlfsm &fsm();
        void dec(int _code, sfg &);
        void dec(int _code, sfg &, sfg &);
        void dec(int _code, sfg &, sfg &, sfg &);
    private:
        char *name;
        clk *ck;
        dfbfix *insq;
        int inswidth;
        int numinstr;
        int codes[MAXINS];
        ctlfsm _fsm;
        state active;
        sfg decode;
        _sigarray *ir;
        cnd *deccnd(int);
        void decchk(int);
    };

    -- the .cxx file

    #include "decoder.h"

    static int numbits(int w)
    {
        int bits = 0;
        while (w)
        {
            bits++;
            w = w >> 1;
        }
        return bits;
    }

    int bitset(int bitnum, int n)
    {
        return (n & (1 << bitnum));
    }

    decoder::decoder(char *_name, clk &_ck, dfbfix &_insq)
        : base(_name)
    {
        name = _name;
        insq = _insq.asSource(this);
        ck = &_ck;
        numinstr = 0;
        inswidth = 0;
        _fsm << _name;
        // active << strapp(name, "_go_");
        active << "go";
        _fsm << deflt(active);
    }

    void decoder::dec(int n)
    {
        // define a decoder that decodes n instructions
        // instruction numbers are 0 to n-1
        // create also the instruction register
        if (!(n>0))
        {
            cerr << "*** ERROR: decoder " << name
                 << " must have at least one instruction\n";
            exit(0);
        }
        inswidth = numbits(n-1);
        if (n > MAXINS)
        {
            cerr << "*** ERROR: decoder " << name
                 << " exceeds decoding capacity\n";
            exit(0);
        }
        dfix bit(0, 1, 0, dfix::ns);
        ir = new _sigarray((char *) strapp(name, "_ir"),
                           inswidth, ck, bit);
        decode.starts();
        int i;
        SIGW(irw, dfix(0, inswidth, 0, dfix::ns));
        for (i=0; i<inswidth; i++)
        {
            if (i)
                (*ir)[i] = cast(bit, irw >>
                           _sig(dfix(i, inswidth, 0, dfix::ns)));
            else
                (*ir)[i] = cast(bit, irw);
        }
        decode << strapp("decod", name);
        decode << ip(irw, *insq);
    }


    void decoder::decchk(int n)
    {
        // check if the decoder can decode this instruction
        int i;
        if (!inswidth)
        {
            cerr << "*** ERROR: decoder " << name
                 << " must first define an instruction width\n";
            exit(0);
        }
        if (n > ((1 << inswidth) - 1))
        {
            cerr << "*** ERROR: decoder " << name
                 << " cannot decode code " << n << "\n";
            exit(0);
        }
        for (i=0; i<numinstr; i++)
        {
            if (n == codes[i])
            {
                cerr << "*** ERROR: decoder " << name
                     << " decodes code " << n << " twice\n";
                exit(0);
            }
        }
        codes[numinstr] = n;
        numinstr++;
    }



    cnd *decoder::deccnd(int n)
    {
        // create the transition condition that corresponds
        // to the instruction number n
        int i;
        cnd *cresult = 0;
        if (bitset(0, n))
            cresult = &_cnd((*ir)[0]);
        else
            cresult = &(!_cnd((*ir)[0]));
        for (i = 1; i < inswidth; i++)
        {
            if (bitset(i, n))
                cresult = &(*cresult && _cnd((*ir)[i]));
            else
                cresult = &(*cresult && !_cnd((*ir)[i]));
        }
        return cresult;
    }

    void decoder::dec(int n, sfg &s)
    {
        // enter an instruction that executes one sfg
        decchk(n);
        active << *deccnd(n) << decode << s << active;
    }

    void decoder::dec(int n, sfg &s1, sfg &s2)
    {
        // enter an instruction that executes two sfgs
        decchk(n);
        active << *deccnd(n) << decode << s1 << s2 << active;
    }

    void decoder::dec(int n, sfg &s1, sfg &s2, sfg &s3)
    {
        // enter an instruction that executes three sfgs
        decchk(n);
        active << *deccnd(n) << decode << s1 << s2 << s3 << active;
    }

    ctlfsm &decoder::fsm()
    {
        return _fsm;
    }

The main principles of generation are the following. Each 
instruction for the ASIP decoder is defined as a number, in 
addition to one to three signal flowgraphs that need to be 
executed when this instruction is decoded. The "decoder" 
object keeps track of the instruction numbers already used 
and warns you if you introduce a duplicate. When the 
instruction number is unique, it is split up into a number 
of instruction bits, and a fsm transition condition is 
constructed from these bits. 
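The bit-splitting step can be followed on plain integers. The sketch below mirrors the "numbits" helper of the listing and adds a hypothetical "matches" function that evaluates, for a given instruction register content, the kind of condition that is built up bit by bit. It does not use OCAPI types; the instruction register is modelled as a vector of booleans with the least significant bit first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of bits needed to represent w (0 needs none), as in the listing.
int numbits(int w) {
    assert(w >= 0);
    int bits = 0;
    while (w) {
        bits++;
        w = w >> 1;
    }
    return bits;
}

// True when the instruction register bits (LSB first) spell out `code`.
// Each bit contributes either the bit itself or its negation to an AND.
bool matches(const std::vector<bool>& ir, int code) {
    bool result = true;
    for (std::size_t i = 0; i < ir.size(); ++i) {
        bool want = ((code >> i) & 1) != 0;
        result = result && (want ? ir[i] : !ir[i]);
    }
    return result;
}
```

For a decoder with n instructions the register is "numbits(n-1)" bits wide, so two instructions need a single bit and five instructions need three.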

The ASIP datapath at work

The use of this object is quite simple. In a timed
description where you want to use the decoder instead of a
plain "fsm", you inherit from this decoder object rather
than from the "base" class. Next, instead of the fsm
description, you give the instruction list and the required
signal flowgraphs to execute.

As an example, an add/subtract ASIP datapath is defined. We
select addition with instruction number 0, and subtraction 
with instruction number 1. The following code (that also 
uses the supermacros) shows the specification. The 
inheritance to "decoder" also establishes the connection to 
the instruction queue. 

-- include file
#ifndef ASIP_DP_H
#define ASIP_DP_H

class asip_dp : public decoder
{
public:
  asip_dp(char *name,
          clk &ck,
          FB &ins,
          _PRT(in1),
          _PRT(in2),
          _PRT(o1));

private:
  PRT(in1);
  PRT(in2);
  PRT(o1);
};

#endif

-- code file
#include "asip_dp.h"

dfix typ(0, 8, 0);

asip_dp::asip_dp(char *name,
                 clk &ck,
                 FB &ins,
                 _PRT(in1),
                 _PRT(in2),
                 _PRT(o1))
  : decoder(name, ck, ins),
    IS_SIG(in1, typ),
    IS_SIG(in2, typ),
    IS_SIG(o1, typ)
{
  IS_IP(in1);
  IS_IP(in2);
  IS_OP(o1);

  SFG(add)
  GET(in1)
  GET(in2)
  o1 = in1 + in2;
  PUT(o1);

  SFG(sub)
  GET(in1)
  GET(in2)
  o1 = in1 - in2;
  PUT(o1);

  dec(2); // decode two instructions
  dec(0, SFGID(add));
  dec(1, SFGID(sub));
}

To conclude, one can note that meta-code generation allows
reuse of design "idioms" (classes) rather than design
"instances" (objects). Intellectual-property code
generators are a direct consequence of this.

Description of a design of systems according to the method 
of the invention 

In the design of a telecommunication system 
(fig. 1A), we distinguish four phases: link design,
algorithm design, architecture design and circuit design. 
These phases are used to define and model the three key 
components of a communication system: a transmitter, a 
channel model, and a receiver. 

• The link design (1) is the requirement capture phase. 
Based on telecommunication properties such as 
transmission bandwidth, power, and data throughput (the 
link requirements) , the system design space is explored 
using small subsystem simulations. The design space 
includes all algorithms which can be used by a 
transmitter/receiver pair to meet the link requirements.
Out of receiver and transmitter algorithms with an
identical functionality, those with minimal complexity
are preferred. Besides this exploration, any expected
transmission impairment must also be modeled into a
software channel model.

• The algorithm design (2) phase selects and interconnects
the algorithms identified in the link design phase. The
output is a software algorithmic description in C++ of
digital transmitter and receiver parts in terms of
floating point operations. To express parallelism in the
transmitter and receiver algorithms, a data-flow data
model is used. Also, the transmission imperfections
introduced by analog parts such as the RF front-ends are
annotated to the channel model.

• The architecture design (3) refines the data model of the 
transmitter or receiver. The target architectural style 
is optimized for high speed execution, uses distributed 
control semantics and pipeline mechanisms. The resulting 
description is a fixed point, cycle true C++ description 
of the algorithms in terms of execution on bit-parallel 
operators. The architecture design is finished with a 
translation of this description to synthesizable VHDL. 

• Finally, circuit design (4) refines the bit-parallel 
implementation to circuit level, including technology 
binding, the introduction of test hardware, and design 
rule checks. 

Target Architecture 

The target architecture (5), shown in figure 2, consists of
a network of interconnected application specific
processors. Each processor is made up of bit-parallel
datapaths. When hardware sharing is applied, a local
control component is also needed to perform instruction
sequencing. The processors are obtained by behavioral
synthesis tools or RT level synthesis tools. In either
case, circuits with a low amount of hardware sharing are
targeted. The network is steered by one or multiple clocks.
Each clock signal defines a clock region. Inside a clock
region the phase relations between all register clocks are
manifest. Clock division circuits are used to derive the
appropriate clock for each processor.

Between each pair of processors, a hardware queue is present
to transport data signals. These queues increase parallelism
inside a clock region and maintain consistency between
different streams of data arriving at one processor.

Across clock region boundaries, synchronization interfaces 
are used. These interfaces detect the presence of data at 
the clock region boundary and gate clock signals for the 
clock region that they feed. This way, non-manifest and 
variable data rates in between clock regions are supported. 

The ensemble of clock dividers and handshake circuits forms 
a parallel scheduler in hardware, synchronizing the 
processes running on the bit-parallel processor.

Overview of the C++ modeling levels 

An overview of the distinct C++ modeling levels used by 
OCAPI is given in figure 3. The C++ modeling spans three 
subsequent levels in the design flow: the link level, the 
algorithm level and the architecture level. The transition 
to the last level, the circuit level, is made by automated
means through code generation. Usually, VHDL is used as the
design language in this lowest level.

The link level is available through data-vector modeling. 
Using a design mechanism called parallelism scaling, this 
level is refined to the algorithm level. The algorithm 
level uses data-flow semantics. Using two distinct refining 
mechanisms in the data-flow level, we can refine this level 
to a register transfer level. 

The two refining mechanisms are clock cycle true modeling 
and fixed point modeling. Clock cycle true modeling is 
achieved by allocating cycle budgets and operators for each 
algorithm. To help the designer in this decision, operation 
profiling is foreseen. Fixed point modeling restricts the 
dynamic range of variables in the algorithms to a range for 
which a hardware operator can be devised. Signal statistics 
are returned by the design to help the designer with this. 
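
How signal statistics translate into a fixed point decision
can be sketched as follows. This is an illustration only: the
helper name required_signed_bits is hypothetical, and it
assumes integer-valued signals stored in two's complement.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: smallest two's-complement width (in bits) whose range
// [-2^(w-1), 2^(w-1) - 1] contains every integer value in [lo, hi].
int required_signed_bits(long lo, long hi)
{
    assert(lo <= hi);
    // Keep well away from overflow of the doubling below.
    assert(lo > LONG_MIN / 4 && hi < LONG_MAX / 4);
    int width = 1;                 // 1 bit covers [-1, 0]
    long min_val = -1;
    long max_val = 0;
    while (lo < min_val || hi > max_val) {
        ++width;
        min_val *= 2;              // -2^(w-1)
        max_val = -min_val - 1;    //  2^(w-1) - 1
    }
    return width;
}
```

A signal observed between -3 and 3 during simulation would
thus need a 3-bit signed operator, while a range of 0 to 4
already needs 4 bits.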

The last level, the architecture model, uses a signal
flowgraph to provide a behavioral description. Using this
description, synthesizable code is generated. The resulting
code can then be mapped onto gates using a register-transfer
design tool such as Synopsys DC.

Data-vector modeling 

The upper level of representation of a communication system 
is the link level. It has the following properties:

• It uses pure mathematical manipulation of functions. Time
is explicitly manipulated and results in irregular-flow
descriptions.

• It uses abstraction of all telecommunication aspects that
are not relevant to the problem at hand.

In this representation level, MATLAB is used for 
simulation. MATLAB uses the data-vector as the basic data 
object. To represent time functions in MATLAB, they are 
sampled at an appropriate rate. Time is present as one of 
the many vector dimensions. For example, the MATLAB vector 
addition 

a = b + c;

can mean both sequential addition in time (if the b and c
vectors are thought of as time-sequential), or parallel
addition (if b and c happen to be defined at one moment in
time). MATLAB simply makes no distinction between these two
cases.

Besides this time-space feature, MATLAB has a lot of other
properties that make it the tool of choice within this
design level:

• The ease with which irregular flow of data is expressed 
with vector operations. For example, the operation 
max(vector), or std(vector). 

• The flexibility of operations. A maximum operation on a
vector of 10 elements or 1000 elements looks identical:
max(vector).

• The interactivity of the tool, and the transparency of
data object management.

• The extended library of operations, which allows very
dense description of functionality.

• Graphics and simulation speed. 

This data-vector restriction is to be refined to a data-flow
graph representation of the system. Definition of the
data-flow graph requires definition of all actors in the
graph (actor contents as well as actor firing rules) and
definition of the graph layout.

In order to design systems effectively with the SOC++ 
design flow, a smooth transition between the data-vector 
level and the data-flow level is needed. A script to 
perform this task is constructed as can be seen in the 
following example. 

Example 1: design of a telecommunication system 
Initial data-vector description 

We consider a pseudonoise (PN) code correlator inside a
direct sequence spread-spectrum (DS/SS) modem as an example
(figure 4).



% input data
in = [1 2 1 3 3 4 1 2];

% spreading code
c = [1 -1 1 -1];

% correlate
ot = corr(in, c)

% find correlation peak
[max, maxpos] = max(ot);

A vector of input data in is defined containing 8 elements.
These are subsequent samples taken from the chip
demodulator in the spread spectrum modem. The dimension of
in thus corresponds to the time dimension. The input vector
in is in principle infinite in length. For simulation
purposes, it is restricted to a data set which has the same
average properties (distribution) as the expected received
data.

The samples of in are correlated with the PN-code vector of 
length 4, c. The output vector ot thus contains 5 samples,
corresponding to the five positions of in at which c can be
aligned. The max function locates the maximum value and
position inside the correlated data. The position maxpos is 
subsequently used to synchronize the PN-code vector with 
the incoming data and thus is the desired output value of 
the algorithm. 
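
The computation can be reproduced in plain C++ as a sketch.
It assumes that corr in the listing denotes a sliding
correlation over the positions at which c fits entirely
inside in; the function names correlate and peak_position are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sliding correlation over fully overlapping positions only: one output per
// alignment of c inside in, so the result has in.size() - c.size() + 1 values.
std::vector<int> correlate(const std::vector<int>& in, const std::vector<int>& c)
{
    assert(!c.empty() && in.size() >= c.size());
    std::vector<int> ot(in.size() - c.size() + 1, 0);
    for (std::size_t k = 0; k < ot.size(); ++k)
        for (std::size_t j = 0; j < c.size(); ++j)
            ot[k] += in[k + j] * c[j];
    return ot;
}

// Position of the first maximum, counted from 1 as MATLAB does.
std::size_t peak_position(const std::vector<int>& ot)
{
    assert(!ot.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < ot.size(); ++k)
        if (ot[k] > ot[best])
            best = k;
    return best + 1;
}
```

For the input data and spreading code given above, this
yields the five correlation values -3, 1, -3, 3, -2, with
the peak at position 4.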

This code is an elegant and compact specification, yet it
leaves some questions open for the PN-correlator designer:

• The algorithm has an implicit startup-effect. The first 
correlation value can only be evaluated after 4 input 
samples are available. From then on, each input sample 
yields an additional correlation value. 

• The algorithm misses the common algorithmic iteration 
found in digital signal processing applications: each 
statement is executed only once. 

• For the implementation, no statement is made regarding
the available cycle budget. This is however an important
specification for the attainable acquisition speed of the
modem.

All of these questions are caused by the parallelism of the
data-vector description.

We now propose a way to make the parallelism of the
operations more visible. Each of the MATLAB operations is
easily interpreted. Inside the MATLAB simulation, the
length of the operands will first be determined in order to
select the correct operation behavior. For example,

[max, maxpos] = max(ot)

determines the maximum on a vector of length 5 (which is
the length of the operand ot). It needs at least 4 scalar
comparisons to evaluate the result. If ot were longer, more
scalar comparisons would be needed. To indicate this in the
description, we explicitly annotate each specific instance
of the generic operations with the length of the input
vectors.



% input data
in = [1 2 1 3 3 4 1 2];
8

% spreading code
c = [1 -1 1 -1];
4

% correlate
ot = corr(in, c)
5         8,4

% find correlation peak
[max, maxpos] = max(ot);
1                   5

This little annotation helps us to see the complexity of
the operations more clearly. We will use this when
considering implementation of the description in hardware.
It is of course not the intention to force a user to do
this (MATLAB does this already for him/her).

When thinking about the implementation of this correlator,
one can imagine different realizations each having a
different amount of parallelism, that is, the mapping of
all the operations inside corr() and max() onto a time/space
axis. This is the topic of the next section.

Scaled description 

Consider again the definition of the PN code, as in: 

% spreading code 
c = [1 -1 1 -1];
4

This MATLAB description defines the variable c to be a 
data-vector containing 4 different values. This vector 
assignment corresponds to 4 concurrent scalar assignments. 
We therefore say that the maximal attainable parallelism in 
this statement is 4. 

In order to achieve this parallelism in the implementation, 
there must be hardware available to perform 4 concurrent 
scalar assignments. Since a scalar assignment in hardware 
corresponds to driving a data bus to a certain state, we 
need 4 busses in the maximal parallel implementation. If
only one bus were desired, then we would have to
indicate this. For each of the statements inside the MATLAB
description, a similar story can be constructed. The
indication of the amount of parallelism is an essential
step in the transition from data-vectors to data-flow. We
call this the scaling of parallelism. It involves a
restriction of the unspecified communication bandwidth in
the MATLAB description to a fixed number of communication
busses. It is indicated as follows in the MATLAB
description.



% input data
in = [1 2 1 3 3 4 1 2];
8@1

% spreading code
c = [1 -1 1 -1];
4@4

% correlate
ot = corr(in, c)
5@1       8,4

% find correlation peak
[max, maxpos] = max(ot);
1@1                 5

As is seen, each assignment is extended with an @i
annotation, which indicates how the parallelism in the data
vectors is ordered onto a time axis. For example, the 8
input values inside in are provided sequentially by writing
8@1. The 4 values of c on the other hand, are provided
concurrently. We see that, whatever implementation of the
corr operation we might use, at least 8 iterations will be
required, simply to provide the data to the operation.
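
The arithmetic behind an N@P annotation can be written down
as a small sketch. The struct Scaling and the function
iterations are hypothetical names introduced for this
illustration only.

```cpp
#include <cassert>

// An N@P annotation: N values in total, P of them transferred concurrently.
struct Scaling {
    int values;       // N, the vector length
    int concurrent;   // P, the number of busses used in parallel
};

// Number of sequential transfers needed to move the whole vector.
int iterations(const Scaling& s)
{
    assert(s.values > 0 && s.concurrent > 0);
    return (s.values + s.concurrent - 1) / s.concurrent;   // ceil(N / P)
}
```

Under this reading, 8@1 needs 8 iterations on one bus, 4@4
needs a single iteration on four busses, and 8@2 would need
4 iterations on two busses.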

At this moment, the description is getting closer to the 
data-flow level, that uses explicit iteration. One more 
step is required to get to the data flow graph level. This 
is the topic of the next section. 

Data flow graph definition 

In order to obtain a graph, the actors and edges inside 
this graph must be defined. Inside the annotated MATLAB 
description, data precedences are already present through 
the presence of the names of the vectors. The only thing 
that is missing is the definition of actor boundaries; 
edges will then be defined automatically by the data 
precedences going across the actor boundaries. 

This can be done by a new annotation to the MATLAB 
description. Three actors will be defined in the DS/SS 
correlator. 

actor1 {
% input data
in = [1 2 1 3 3 4 1 2];
8@1
}

actor2 {
% spreading code
c = [1 -1 1 -1];
4@4

% correlate
ot = corr(in, c)
5@1       8,4
}

actor3 {
% find correlation peak
[max, maxpos] = max(ot);
1@1                 5
}

Again the annotation should be seen as purely conceptual; 
it is not intended for the user to write this code. Given 
these annotations, a data flow graph can be extracted from 
the scaled MATLAB description in an unambiguous way. 

• actor1 is an actor with no input, and one output, called
in.

• actor2 is an actor with 1 input in and one output ot. 

• actor3 is an actor with 1 input ot and outputs maxpos and 
max. 

Furthermore, the simulation uses queues to transport 
signals in between the actors. We need three queues, called 
in, ot and maxpos. 

The missing piece of information for simulation of this
dataflow graph is the set of firing rules (or equivalently
the definition of productions and consumptions on each
edge). A naive data flow model is shown in figure 4: actor1
(10) produces 8 values, which are correlated by actor2 (11),
while the maximum is selected inside actor3 (12).



This would however mask the parallelism scaling operation
inside the MATLAB description. For example, it was chosen
to provide the 8 values of the in vector in a sequential
way over a parallel bus. It is believed that the multi-rate
SDF model therefore is not a good container for the
annotated MATLAB description.

Another approach is a cyclostatic description. In this case 
we have a graph as in figure 5.

We see that the determination of production patterns
involves examining the latencies of operations internal to
the actor. This increases the complexity of the design
script. It is simpler to perform a demand driven scheduling
of all actors. The firing rule only has to examine the
availability of input tokens.
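
A firing rule that only inspects input token availability
can be sketched with standard containers. This is a
simplified dynamic scheduler written for illustration, not
the OCAPI scheduler; Actor, Queue and run are hypothetical
names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

using Queue = std::deque<int>;

// Hypothetical actor: fires when its input queue holds at least `needed` tokens.
struct Actor {
    Queue* in;                    // nullptr for an actor without input
    std::size_t needed;           // tokens consumed per firing
    std::function<void()> fire;   // the actor body
    bool can_fire() const { return in == nullptr || in->size() >= needed; }
};

// Fire any actor whose firing rule is satisfied until nothing can fire or the
// firing budget is spent. Returns the number of firings performed.
int run(std::vector<Actor>& actors, int budget)
{
    int fired = 0;
    bool progress = true;
    while (progress && fired < budget) {
        progress = false;
        for (Actor& a : actors) {
            if (fired < budget && a.can_fire()) {
                a.fire();
                ++fired;
                progress = true;
            }
        }
    }
    return fired;
}
```

The scheduler never looks inside an actor: no latencies or
production patterns are needed, only the queue fill level.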

The desired dataflow format as in figure 6 is thus situated
in between the multirate SDF level and the cyclostatic SDF
level. It is proposed to annotate consumptions and
productions in the same way as it was written down in the
MATLAB description:

• 8@1 is the production of actor1. It means: 8 samples are
produced one at a time.

• 8@1 and 5@1 are the consumption and production of actor2
respectively.

• 5@1 and 1@1, 1@1 are the consumption and productions for
actor3.

Data-flow simulation

Given an annotated MATLAB description, a simulation can now
be constructed by writing a high-level model for each
actor, interconnecting these with queues and constructing a
system schedule. OCAPI provides both a static scheduler and
a demand-driven scheduler.

Out of this simulation, several statistics are gathered: 

• On each queue, put and get counts are observed, as well
as signal statistics (minimum and maximum values). The
signal statistics provide an idea of the required
buswidths of communication busses.

• The scheduler counts the firings per actor, and operation 
executions (+, *, ...) per actor. This profiling helps 
the designer in deciding cycle budgets and hardware 
operator allocation for each actor. 

These statistics are gathered through a C++ operator
overloading mechanism, so the designer gets them for free
by using the appropriate C++ objects (schedule, queue and
token class types) for simulation.
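
The mechanism can be sketched as follows. The classes below
are stand-ins written for this illustration (Token, StatQueue
and OpCounts are hypothetical names); the internals of the
OCAPI classes are not described in the text.

```cpp
#include <cassert>
#include <deque>
#include <limits>

// Hypothetical profiling counters for the operations executed.
struct OpCounts {
    long adds = 0;
    long muls = 0;
};

inline OpCounts& counts()
{
    static OpCounts c;
    return c;
}

// A token that behaves like a double but records each + and * it takes part in.
struct Token {
    double value;
};

inline Token operator+(Token a, Token b) { ++counts().adds; return Token{a.value + b.value}; }
inline Token operator*(Token a, Token b) { ++counts().muls; return Token{a.value * b.value}; }

// A queue that tracks put/get counts and the range of the values it carried.
class StatQueue {
public:
    void put(Token t)
    {
        ++puts_;
        if (t.value < min_) min_ = t.value;
        if (t.value > max_) max_ = t.value;
        data_.push_back(t);
    }
    Token get()
    {
        assert(!data_.empty());
        ++gets_;
        Token t = data_.front();
        data_.pop_front();
        return t;
    }
    long puts() const { return puts_; }
    long gets() const { return gets_; }
    double min() const { return min_; }
    double max() const { return max_; }

private:
    std::deque<Token> data_;
    long puts_ = 0, gets_ = 0;
    double min_ = std::numeric_limits<double>::infinity();
    double max_ = -std::numeric_limits<double>::infinity();
};
```

An actor written with Token and StatQueue needs no extra
profiling code: evaluating an expression such as a * b + a
updates the operation counts as a side effect.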

We are next interested in the detailed clock-cycle true 
behavior of the actors and the required storage and 
handshake protocol circuits on the communication busses. 
This is the topic of the next step, the actor definition. 

Actor definition 

The actor definition is based on two elements: 

• Signal-flowgraph representation of behavior.

• Time-verification of the system.

The two problems can be solved independently using the
annotated MATLAB code as specification. In OCAPI:

• The actor RT modeling proceeds in C++ and can be freely
intermixed with high level descriptions regarding both
operator wordlength effects and clock-cycle true timing.

• The time-verification approach allows the system
feasibility to be checked at all times by warning the
designer of deadlock and/or causality violations of the
communication.

Signal flowgraph definition 

Within the OCAPI design flow, a class library was developed 
to simulate behavior at RT-level. It allows 

• To express the behavior of an algorithm with arbitrary
implementation parallelism by setting up a signal flow
graph (SFG) data structure.

• To simulate the behavior of an actor at a clock-cycle
true level by interpreting this SFG data structure with
instantiated token values.

• To specify wordlength characteristics of operations
regarding sign, overflow and rounding behavior. Through
explicit modeling of the quantization characteristic
rather than the bit-vector representation (as in SPW),
efficient simulation runtimes are obtained.

• To generate C++ code for this actor, and hence perform 
the clock cycle true simulation with compiled code. 

• To generate VHDL code for this actor, and synthesize an 
implementation with Synopsys DC. 

• To generate DSFG code for this actor, and synthesize an
implementation with Cathedral-3. It was observed that
Cathedral-3 does a better job with respect to both
critical path and area of the obtained circuits than
Synopsys DC. The best synthesis results are obtained by
first using Cathedral-3 to generate a circuit at gate
level and then Synopsys DC to perform additional logic
optimization as a postprocessing step.
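
The idea of modeling the quantization characteristic instead
of a bit-vector can be sketched as follows: the value stays a
native double and the wordlength is applied as a function. The
format description FixFormat, the function quantize, and the
choice of round-to-nearest with saturation are assumptions
made for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical fixed-point format: signed two's complement, `width` bits in
// total, of which `frac` bits lie to the right of the binary point.
struct FixFormat {
    int width;
    int frac;
};

// Quantize a real value: round to the nearest step, then saturate on overflow.
double quantize(double x, const FixFormat& f)
{
    assert(f.width > 0 && f.frac >= 0 && f.frac < f.width);
    const double step = std::ldexp(1.0, -f.frac);                       // 2^-frac
    const double max_val = std::ldexp(1.0, f.width - 1 - f.frac) - step;
    const double min_val = -std::ldexp(1.0, f.width - 1 - f.frac);
    double q = std::round(x / step) * step;
    if (q > max_val) q = max_val;
    if (q < min_val) q = min_val;
    return q;
}
```

Each operation then costs one native floating point operation
plus a quantization call, instead of a bit-by-bit evaluation.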

An important observation was made regarding simulation
speed. For equivalent descriptions at different
granularities, the following relative runtimes were found:

• 1 for the MATLAB simulation.

• 2 for the untimed, high level C++ data flow description.

• 4 for the timed, fixed point C++ description (compiled
code).

• 40 for the procedural, word-level VHDL description.

It is thus concluded that RT-modeling of systems within 
OCAPI is possible within half an order of magnitude of the 
highest level of description. VHDL modeling however, is 
much slower. Currently the figure of 40 times MATLAB is 
even considered an under-estimate. Future clock-cycle based
VHDL simulators can only solve half of this problem, since
they still use bit-vector based simulation of tokens rather
than quantization based simulation.

Next, the modeling issues in C++ are shown in more detail.
The C++ signal-flowgraph representation uses a signal
data-type, that can be either a registered or else an
immediate value. With this data-type, expressions are formed
using the conventional scalar operations (+, -, *, shifts
and logical operations). Expressions are grouped together in
a signal flowgraph. A signal flowgraph interfaces with the
system through the data-flow simulation queues. Several
signal flowgraphs can be grouped together into an SFG
sequence. An SFG sequence is an expression of behavior that
spans several cycles. The specification is done through a
finite state machine model, for which transition conditions
can be expressed. The concept of SFG modeling is pictured
in figure 7.

Different SFG's combined with a finite state machine make
up the clock-cycle true actor model. Within the actor, SFG
communication proceeds through registered signals.
Communication over the boundaries of an actor proceeds
through simulation queues.

When the actor is specified in this way, and all signal
wordlengths are annotated to the description, an automated
path to synthesis is available. Several different SFG's can
be assigned to one datapath. Synthesizable code is
generated in such a way that hardware sharing between
different SFG's is possible. A finite state machine (FSM)
description is first translated to SFG format to generate
synthesizable code in the same way. There is an implicit
hierarchy available with this method: by assigning
different FSM-SFG's to one datapath, an overall processor
architecture is obtained that again has a mode port and
therefore looks like a (multicycle) datapath. For macro
control problems (such as acquisition/tracking algorithm
switching in modems), this is a necessity.

Although the distance between the annotated MATLAB level
and this RT-level SFG seems large, it is reasonable on the
actor level. Consider for example



actor3 {
% find correlation peak
[max, maxpos] = max(ot);
1@1                 5
}



We are asked here to write the timed max() operation as an
SFG. actor2 has scaled the parallelism of ot to 5@1.
A solution is presented in actual C++ code.



{
  FB qin("qin");       // input queue
  FB q1out("qout");    // output queue
  FB q2out("qout");    // output queue
  FB start("start");   // the start pin of the processor

  clock ck;
  _sig currmax(ck, dfix(0));   // register holding current maximum
  _sig maxpos(ck, dfix(0));    // register holding position of max
  _sig currpos(ck, dfix(0));   // current position
  _sig inputvalue;             // holds input values
  _sig maxout;
  _sig maxposout;
  _sig one(dfix(1));           // a constant

  SFG sfg0, sfg1, sfg2;        // we use 3 sfg's

  sfg0.starts();               // code after this is for sfg0
  currmax = inputvalue;
  maxpos = one;
  currpos = one;
  // next, give sfg0 a mode and an input queue
  sfg0 << "m0" << ip(inputvalue, qin);

  sfg1.starts();               // code after this is for sfg1
  // this is a conditional assignment
  currmax = (inputvalue > currmax).cassign(inputvalue, currmax);
  maxpos = (inputvalue > currmax).cassign(currpos, maxpos);
  currpos = currpos + 1;
  sfg1 << "m1" << ip(inputvalue, qin);

  sfg2.starts();               // the last SFG
  maxposout = (inputvalue > currmax).cassign(_sig(dfix(4)), maxpos);
  maxout = (inputvalue > currmax).cassign(inputvalue, currmax);
  sfg2 << "m2" << op(maxout, q1out) << op(maxposout, q2out);

  state s0("s0"), s1("s1"), s2("s2"), s3("s3");
  s0 >> !cnd(start) >> s0;
  s0 >> cnd(start) >> sfg0 >> s1;
  s1 >> allways >> sfg1 >> s2;
  s2 >> allways >> sfg1 >> s3;
  s3 >> allways >> sfg2 >> s0;
}

As an aid to interpret the C++ code, the equivalent
behavior is shown in figure 8. The behavior is modeled as a
4-cycle description. Three SFG's (13, 14, 15) are needed, in
addition to a 4-state controller (16). The controller is
modeled as a Mealy machine.
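
For readers without access to the OCAPI classes, the
behavior of such a peak search can be approximated in plain
C++. The sketch below is a stand-in, not a translation of the
listing: the class name PeakDetector is hypothetical, the
frame length is a constructor parameter, and one call to
step() plays the role of one clock cycle.

```cpp
#include <cassert>

// Plain C++ stand-in for the cycle-true behavior: one input sample is consumed
// per call to step(), and the result becomes valid on the last sample.
class PeakDetector {
public:
    explicit PeakDetector(int length) : length_(length) { assert(length > 0); }

    // Returns true when the frame is complete and max()/maxpos() are valid.
    bool step(int sample)
    {
        if (count_ == 0 || sample > max_) {   // first sample, or a new maximum
            max_ = sample;
            maxpos_ = count_ + 1;             // positions are counted from 1
        }
        ++count_;
        if (count_ == length_) {
            count_ = 0;                       // ready for the next frame
            return true;
        }
        return false;
    }

    int max() const { return max_; }
    int maxpos() const { return maxpos_; }

private:
    int length_;
    int count_ = 0;
    int max_ = 0;
    int maxpos_ = 0;
};
```

The first call initializes the running maximum, the
intermediate calls update it, and the last call delivers the
result, which mirrors the roles of the three SFG's.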

The C++ description also illustrates some of the main
contributions of OCAPI: register-transfer level aspects
(signals, clocks, registers), as well as dataflow aspects
(simulation queues) are freely intermixed and used as
appropriate. By making use of C++ operator overloading and
classes, these different design concepts are represented in
a compact syntax format. Compactness is a major design
issue.

Having this specification, we have all information to 
proceed with the detailed architectural design of the 
actor. This is however only part of the system design 
solution: we are also interested in how to incorporate the 
cycle-true result in the overall system. 

Time verification 

The introduction of time (clock cycles) in the simulation
uses an expectation-based approach. It allows the use of
either a high level or else an SFG-type description of the
actor, and simulation of the complete system clock-cycle
true. The simulation helps the designer in finding whether
his 'high-level' description matches the SFG description,
and secondly, whether the system is realizable.



A summary of the expectation based simulation is given in
figure 10 and is used to illustrate the ideas mentioned
below.

This is a different approach than when analysis is used
(e.g. the evaluation of a compile-time schedule and token 
lifetimes) to force restrictions onto the actor 
implementation. This traditional approach gives the 
designer no clue on whether he is actually writing down a 
reasonable description. 

Each token in the simulation is annotated with a time when 
it is created: the token age. Initial tokens are born at 
age 0, and grow older as they proceed through the dataflow 
graph. The unit of time is the clock cycle. 

Additionally, each queue in the simulation holds a queue
age (say, 'the present') that is used to check the
causality of the simulation: a token entering a queue
should not be younger than this boundary. A queue is only
able to delay tokens (registers), and therefore can only
work with tokens that are older than the queue age.

If such a consistency violation is detected, a warning 
message is issued and the token age is adapted to that of 
the queue. Otherwise, the time boundary of the queue is 
updated with the token age after the token is installed on 
the queue. 

The queue age is steered by the actor that drives it. For
each actor the designer formulates an iteration time. The
iteration time corresponds to the cycle budget that the
designer expects to need for the detailed actor
description. Upon each actor firing, the queues driven by
the actor are aged with the iteration time.

At the same time, the actor operations also increase the
age of the tokens they process. For normal operations, the
resulting token age is equal to the maximum of the operand
token ages. For registered signals (only present in
SFG-level actor descriptions), the token age is increased by
one. Besides aging by operation, aging inside of the queues
is also possible by attaching a travel delay to each queue.

Like the high-level actor description, a queue is also 
annotated with a number of expectations. These annotations 
reflect what the implementation of the queue as a set of 
communication busses should look like. 

A communication bus contains one or more registers to
provide intermediate storage, and optionally also a
handshake-protocol circuit. A queue then maps to one or
more (for parallel communication) of these communication
busses.

The expectations for a simulation queue are:

• The token concurrency, that expresses how many tokens of 
the same age can be present on one queue. To communicate 
a MATLAB vector annotated with 8@2 for example requires 
two communication busses. This is reflected in the high 
level queue model by setting the token concurrency to 
two. 

• In case the token concurrency is 1, it can be required
that subsequent tokens are separated by a determined
number of clock cycles. In combination with the travel
delay, this determines how many registers are needed on a
communication bus. This expectation is called the token
latency.

Example implementations for different expectations are
shown in figure 9.

When the token concurrency is different from one, the token
latency cannot be bigger than one. If it were, then the
actor that provides the tokens could be designed more
effectively using hardware sharing, thus reducing the
token concurrency.

A summary of the expectation based simulation is put as
follows. First, there are several implicit adaptations to
token ages and queue ages.

• An actor description increases the queue age upon each 
actor iteration with the iteration time. 

15 • A queue increases the age of communicated tokens with the 
travel delay. 

• An SFG description increases token ages through the 
operations. The token age after a register is increased 
by one, all other operations generate a token with age 

20 equal to the maximum of the operand ages. 

The set of operations that modify the token age are 
referred to as token aging rules. 

Next, a number of checks are active to verify the
consistency of the simulation.

• A token age cannot be younger (smaller) than a queue age.

• The token concurrency on a queue cannot be exceeded.

• The token latency on a queue cannot be exceeded.

A successful clock-cycle true simulation should never fail
any of these checks. In the case of such success, the
expectations on the queue can be investigated more closely
to devise a communication bus for it. In this description
we did not mention the use of handshake protocol circuits.
A handshake protocol circuit can be used to synchronize
tokens of different age at the input of an actor.
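
The token aging rules and the token-age check summarized above can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not OCAPI code: the names Token, after_register, after_operation, after_queue and age_is_consistent are hypothetical, ages are counted in whole clock cycles, and addition stands in for an arbitrary operation.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token: a value plus an age expressed in clock cycles.
struct Token {
    double value;
    long age;
};

// Aging rule: a register delays a token by one clock cycle.
Token after_register(const Token& t) {
    return Token{t.value, t.age + 1};
}

// Aging rule: any other operation yields a token whose age is the
// maximum of the operand ages (addition is used as the example).
Token after_operation(const Token& a, const Token& b) {
    return Token{a.value + b.value, std::max(a.age, b.age)};
}

// Aging rule: a queue adds its travel delay to a communicated token.
Token after_queue(const Token& t, long travel_delay) {
    assert(travel_delay >= 0);
    return Token{t.value, t.age + travel_delay};
}

// Consistency check: a token may not be younger than the queue age.
bool age_is_consistent(const Token& t, long queue_age) {
    return t.age >= queue_age;
}
```

Adding a token of age 2 to a token of age 5 gives a token of age 5; passing it through a register gives age 6, and a queue with travel delay 2 would deliver it at age 7.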

Implementation 

The current library of OCAPI allows a system to be described
in C++ by building on a set of basic classes.

• A simulation queue class that transports a token class
and allows expectation checks to be performed.

• An SFG/FSM class that allows clock cycle true
specification, simulation and code generation.

• A token class that allows simulation of both floating
point type representation and fixed point type
representation.

One can simulate the MATLAB data-vector data-type with C++
simulation queues. For the common MATLAB operations, one
can develop a library of SFG descriptions that reflect
different flavors of parallelism. For instance, a C++
version of the description

    % input data
    in = [1 2 1 3 3 4 1 2];
    % spreading code
    c = [1 -1 1 -1];
    % correlate
    ot = corr(in, c)
    % find correlation peak
    [max, maxpos] = max(ot);

looks, after scaling of the parallelism and defining the
actor boundaries, like

    FB in, ot, maxp;

    in.delay(1, 0);      // iteration time, travel delay
    ot.delay(1, 0);
    maxp.delay(4, 0);

    in.expect(1, 1);     // travel time, concurrency, latency
    ot.expect(1, 1);
    maxp.expect(1, 4);

    in = vector(1, 2, 1, 3, 3, 4, 1, 2);
    ot = corr(8, 4, in, vector(1, -1, 1, -1));
    maxp = maxpos(4, ot);

This C++ description contains all information necessary to
simulate the system in mind at clock cycle true level and
to generate the synthesizable code for the system and the
individual actors.

Thus, the data-flow level has become transparent: it is
not explicitly seen by the designer but rather it is
implied through the expectations (pragmas) and the
library.

Example 2: design of a 4-tap correlator processor

An example of processor design is given next to illustrate
hardware design when using OCAPI.

The task is to design a 4-tap correlator processor that
evaluates a correlation value every two cycles. One
coefficient of the correlation pattern needs to be
programmable and needs to be read in after a control signal
is asserted. The listing in figure 11 gives the complete
FSMD model of this processor.

The top of the listing shows how types are declared in
OCAPI. For example, the type T_sample is 8 bits wide and
has 6 bits beyond the binary point.

For such a type declaration, a signed, wrap-around and
truncating representation is assumed by default. This can
be easily changed, as for instance in

    // floating point
    dfix T_sample;

    // unsigned
    dfix T_sample(8, 6, ns);

    // unsigned, rounding
    dfix T_sample(8, 6, ns, rd);
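
To make the default representation concrete, the following sketch quantizes a double to a signed word with wrap-around and truncation. It is an illustration only and not the dfix implementation: the function name quantize_wrap_trunc and its parameters are hypothetical, and truncation is assumed here to mean rounding toward minus infinity.

```cpp
#include <cassert>
#include <cmath>

// Quantize x to a signed fixed point word of `width` bits, of which
// `frac_bits` lie beyond the binary point. Overflow wraps around
// (two's complement); excess precision is truncated (floor).
double quantize_wrap_trunc(double x, int width, int frac_bits) {
    assert(width > 0 && width < 62);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    const long long span = 1LL << width;              // 2^width codes
    long long q = static_cast<long long>(std::floor(x * scale));
    q = ((q % span) + span) % span;   // wrap into [0, 2^width)
    if (q >= span / 2) q -= span;     // reinterpret as signed
    return static_cast<double>(q) / scale;
}
```

With 8 bits and 6 bits beyond the binary point, the representable range is [-2, 2) in steps of 1/64, so 0.3 is stored as 0.296875 and 2.0 wraps around to -2.0.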

Below the type declarations we see coefficient
declarations. These are specified as plain double types,
since they will be automatically quantized when read
into the coefficient registers. It is possible to intermix
existing C/C++ constructs and types with new ones.

Following the coefficients, the FSMD definition of the
correlator processor is shown. This definition requires
the specification of the instruction set that is processed
by this processor, and a specification of the control
behavior of the processor. For each of these, OCAPI uses
dedicated objects.

First, the instruction set is defined. Each instruction
performs data processing on signals, which must be defined
first. The definitions include plain signals (sample_in and
corr_out), registers (accu), and register arrays (coef[]
and sample[]).

Next, each of the instructions is defined. A definition is
started by creating an SFG object. All signal expressions
that come after such an SFG definition are considered to
make up part of it. An SFG definition is closed simply by
defining a new SFG object.

The first instruction, initialize_coefs, initializes the
coefficient registers coef[]. The for loop allows the
initialization to be expressed in a compact way. Thus, the
initialize_coefs instruction is also equivalent to

    coef[0] = W(T_coef, hardwired_coef[0])
    coef[1] = W(T_coef, hardwired_coef[1])
    coef[2] = W(T_coef, hardwired_coef[2])
    coef[3] = W(T_coef, hardwired_coef[3])



The second instruction programs the value of the first
coefficient. The new value, coef_in, is read from an input
port of the FSMD with the same name. Beyond this port, we
are 'outside' of the timed FSMD description and use
dataflow semantics, communicating via queues.

The third and fourth instructions, correl_1 and correl_2,
describe the two phases of the correlation. It is very easy
to express complex expressions just by using C++ operators.
Also, a cast operation is included that limits the
precision of the intermediate expression result. Although
this is of minor importance for simulation, it has a strong
influence on the hardware synthesis result.

The instruction readsample shifts the data delay line. In
addition to a for loop, an if expression is used to express
the boundary value for the delay line. Use of simple C++
constructs such as these allows signal flow graph
structure to be expressed in a compact and elegant way.
It is especially useful in parametric design.

The last instruction, readcontrol, reads in the control
value that will decide whether the first correlation
coefficient needs to be refreshed.

Below all SFG definitions, the control behavior of the
correlator processor is described. An FSM with three states
is defined, using one initial state rst, and two normal
states phase_1 and phase_2. Next, four transitions are
defined between those three states. Each transition
specifies a start state, the transition condition, a set of
instructions to execute, and a target state. For a designer
used to finite state machine specification, this is a very
compact and efficient notation.

The transition condition always is always true, while a
transition condition like cnd(load) will be true whenever
the register load contains a one.
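
A transition as described above (start state, condition, instructions, target state) can be modelled with a small record. The sketch below is not the OCAPI fsm class: Transition, select_transition and the use of strings as instruction names are assumptions made for illustration, and the actual transitions of the correlator are those of figure 11, which are not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class State { rst, phase_1, phase_2 };

// Hypothetical transition record with the four parts named in the text.
struct Transition {
    State from;                             // start state
    std::function<bool()> condition;        // transition condition
    std::vector<std::string> instructions;  // names of SFGs to execute
    State to;                               // target state
};

// Return the first transition that leaves `current` and whose
// condition holds, or nullptr when no transition is enabled.
const Transition* select_transition(const std::vector<Transition>& fsm,
                                    State current) {
    for (const Transition& t : fsm) {
        assert(t.condition);  // every transition needs a condition
        if (t.from == current && t.condition()) return &t;
    }
    return nullptr;
}
```

An unconditional transition corresponds to a condition that always returns true; a register-based condition would be a callable that reads the register value.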

The resulting fsm description is returned to OCAPI by the
last return statement. The simulator and code generator can
now process the object hierarchy in order to perform
semantic checks, simulation, and code generation.

The translation to synthesizable VHDL and Cathedral-3 code
is automatic and needs no extra designer effort. The
resulting circuit for datapath and controller is shown in
figure 12. The hierarchy of the generated code that is
provided by OCAPI is also indicated. Each controller and
datapath are interlinked using a link cell. The link cell
itself can be embedded into an automatically generated
testbench or in the system link cell that
interconnects all components.

Example 3: design of Complex High Speed ASICs

The design of a 75 Kgate DECT transceiver is used as
another example (figure 13).

The design consists of a digital radiolink transceiver
ASIC, residing in a DECT base station (20) (figure 13). The
chip processes DECT burst signals, received through a radio
frequency front-end RF (21). The signals are equalized (22)
to remove the multipath distortions introduced in the radio
link. Next, they are passed to a wire-link driver DR (23)
that establishes communication with the base station
controller BSC (24). The system is also controlled locally
by means of a control component CTL (25).

The specifications that come with the design of the digital 
transceiver ASIC in this system are as follows: 

• The equalization involves complex signal processing, and 
is described and verified inside a high level design 
environment such as MATLAB. 

• The interfacing towards the control component CTL and the 
wire-link driver DR on the other hand is described as a 
detailed clock-cycle true protocol. 

• The allowed processing latency is, due to the real time
operation requirements, very low: a delay of only 29 DECT
symbols (25.2 µseconds) is allowed. The complexity of the
equalization algorithm, on the other hand, requires up to
152 data multiplies per DECT symbol to be performed. This
implies the use of parallel data processing, and
introduces a severe control problem.

• The scheduled design time to arrive from the
heterogeneous set of specifications to the verified gate
level netlist is 18 person-weeks.

The most important degree of freedom in this design process
is the target architecture, which must be chosen such that
the requirements are met. Due to the critical design time,
a maximum of control over the design process is required.
To achieve this, a programming approach to implementation
is used, in which the system is modelled in C++. The object
oriented features of this language allow high-level
descriptions of undesigned components to be mixed with
detailed clock-cycle true, bit-true descriptions. In
addition, appropriate object modelling allows the detailed
descriptions to be translated to synthesizable HDL
automatically. Finally, verification testbenches can be
generated automatically in correspondence with the C++
simulation.

The result of this design effort is a 75 Kgate chip with a
VLIW architecture, including 22 datapaths, each decoding
between 2 and 57 instructions, and including 7 RAM cells.
The chip has a 194 mm² die area in 0.7 µm CMOS technology.

The C++ programming environment allows results to be
obtained faster than existing approaches. Compared with
register transfer design environments, it will be shown
that C++ allows more compact, and consequently
less error prone, descriptions of hardware. High level
synthesis environments could solve this problem but have to
fix the target architecture beforehand. As will be
described in the case of the DECT transceiver design,
sudden changes in target architecture can occur due to hard
initial requirements that can be verified only at system
implementation.

First, the system machine model is introduced. This model
includes two types of description: high-level untimed ones 
and detailed timed blocks. Using such a model, a simulation 
mechanism is constructed. It will be shown that the 
proposed approach outperforms current synthesis 
environments in code size and simulation speed. Following 
this, HDL code generation issues and hardware synthesis 
strategies are described. 

System Machine Model 

Due to the high data processing parallelism, the DECT
transceiver is best described with a set of concurrent
processes. Each process translates to one component in the
final system implementation.

At the system level, processes execute using data flow
simulation semantics. That is, a process is described as an
iterative behavior, where inputs are read in at the start
of an iteration, and outputs are produced at the end.
Process execution can start as soon as the required input
values are available.

Inside of each process, two types of description are
possible. The first one is a high level description, and
can be expressed using procedural C++ constructs. A firing
rule is also added to allow dataflow simulation.

The second flavour of processes is described at register
transfer level. These processes operate synchronously to
the system clock. One iteration of such a process
corresponds to one clock cycle of processing.

For system simulation, two schedulers are available. A 
dataflow scheduler is used to simulate a system that 
contains only untimed blocks. This scheduler repeatedly 
checks process firing rules, selecting processes for 
execution as their inputs are available. 

When the system also contains timed blocks, a cycle 
scheduler is used instead. The cycle scheduler manages to 
interleave execution of multi-cycle descriptions, but can 
incorporate untimed blocks as well. 
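
The dataflow scheduler described above can be sketched as a loop that keeps checking firing rules until nothing can fire any more. This is a simplified illustration, not the scheduler of the design environment: Process, Queue and run_dataflow are hypothetical names, tokens are plain integers, and the loop assumes that every source eventually runs out of data.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

using Queue = std::deque<int>;  // hypothetical interconnect queue

// Hypothetical untimed process: a firing rule plus a behavior.
struct Process {
    std::function<bool()> can_fire;  // firing rule: inputs available?
    std::function<void()> fire;      // one iteration of the behavior
};

// Repeatedly check the firing rules and execute every process whose
// inputs are available, until a full pass fires nothing.
// Returns the total number of firings.
std::size_t run_dataflow(std::vector<Process>& processes) {
    std::size_t firings = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (Process& p : processes) {
            assert(p.can_fire && p.fire);
            if (p.can_fire()) {
                p.fire();
                ++firings;
                progress = true;
            }
        }
    }
    return firings;
}
```

For a chain of two processes fed by a queue holding three tokens, the loop fires each process three times regardless of the order in which the processes are listed.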

Figure 14 shows the front-end processing of the DECT
transceiver, and the difference between data-flow and cycle
scheduling. At the top, the front-end processing is seen.
The received signals are sampled by an A/D, and correlated
with a unique header pattern in the header correlator HCOR.
The resulting correlations are detected inside a header
detector block HDET. A simulation with high level
descriptions uses the dataflow scheduler. An example
dataflow schedule is seen in the middle of the figure. The
A/D high level description produces 3 tokens, which are put
onto the interconnect communication queue. Next, the
correlator high level description can be fired three times,
followed by the detector processing.

When a cycle true description of the A/D and header
correlator on the other hand is available, this system can
be simulated with the cycle scheduler as shown on the
bottom of the figure. This time, the behaviors of the A/D
block and correlator block are interleaved. As shown for
the HCOR block, executions can take multiple cycles to
perform. The remaining high level block, the detector,
contains a firing rule and is executed as required. Related
to the global clock grid, it appears as a combinatorial
function.

Detailed process descriptions reflect the hardware behavior
of a component at the same level as the implementation. To
gain simulation performance and reduce coding effort,
several abstractions are made.



Finite wordlength effects are simulated with a C++ fixed
point library. It has been shown that the simulation of
these effects is easy in C++. Also, the simulation of the
quantization rather than the bitvector representation
allows significant simulation speedups.

The behavior is modelled with a mixed control/data
processing description, in the form of a finite state
machine coupled to a datapath. This model is common in the
synthesis community. In high throughput telecommunications
circuits such as the ones in the DECT transceiver ASIC, it
most often occurs that the desired component architecture
is known before the hardware description is made. The FSMD
model works well for this type of component.



The two aspects, wordlength modelling and cycle true
modelling, are available in the programming environment as
separate class hierarchies. Therefore, fixed point
modelling can be applied equally well to high level
descriptions.

As an illustration of cycle true modelling, a part of the 
central VLIW controller description for the DECT 
transceiver ASIC is shown in figure 15. The top shows a 
Mealy type finite state machine (30). As actions, the
signal flowgraph descriptions (31) below it are executed. 
The two states execute and hold correspond to operational 
and idle states of the DECT system respectively. The 
conditions are stored in registers inside the signal 
flowgraphs. In this case, the condition holdrequest is 
related to an external pin. 

In execute state, instructions are distributed to the 
datapaths. Instructions are retrieved out of a lookup 
table, addressed by a program counter. When holdrequest is 
asserted, the current instruction is delayed for execution, 
and the program counter PC is stored in an internal 
register. During a hold, a nop instruction is distributed 
to the datapaths to freeze the datapath state. As soon as 
holdrequest is removed, the stored program counter holdpc 
addresses the lookup table, and the interrupted instruction 
is issued to the datapaths for execution. 
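
The hold behavior of the controller can be summarized in a small sketch. It is a behavioral approximation under stated assumptions, not the listing of figure 15: the class VliwController, the constant kNop, the integer instruction encoding and the wrap-around of the program counter at the end of the table are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr int kNop = -1;  // hypothetical encoding of the nop instruction

class VliwController {
public:
    explicit VliwController(std::vector<int> table)
        : table_(std::move(table)) {
        assert(!table_.empty());
    }

    // One clock cycle: returns the instruction sent to the datapaths.
    int step(bool holdrequest) {
        if (holdrequest) {
            if (!holding_) {       // entering hold: remember the PC
                holdpc_ = pc_;
                holding_ = true;
            }
            return kNop;           // freeze the datapath state
        }
        if (holding_) {            // leaving hold: resume at holdpc
            pc_ = holdpc_;
            holding_ = false;
        }
        const int instruction = table_[pc_];
        pc_ = (pc_ + 1) % table_.size();  // assumed wrap-around
        return instruction;
    }

private:
    std::vector<int> table_;       // instruction lookup table
    std::size_t pc_ = 0;
    std::size_t holdpc_ = 0;
    bool holding_ = false;
};
```

The instruction that was about to be issued when holdrequest was asserted is the first one issued after holdrequest is removed, so no instruction is skipped or repeated.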

Signals and Signal Flow Graphs 

Signals are the information carriers used in construction
of a timed description. Signals are simulated using C++ sig
objects. These are either plain signals or else registered
signals. In the latter case the signals have a current
value and a next value, which are accessed at signal
reference and assignment respectively. Registered signals
are related to a clock object clk that controls signal
update. Both types of signals can be either floating point
values or else simulated fixed point values.
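
The current-value/next-value behavior of a registered signal can be illustrated as follows. The classes Clock and Reg are hypothetical stand-ins and do not reproduce the sig and clk classes of the library; values are plain doubles for brevity.

```cpp
#include <cassert>
#include <vector>

class Reg;

// Hypothetical clock object that controls the update of its registers.
class Clock {
public:
    void attach(Reg* r) {
        assert(r != nullptr);
        regs_.push_back(r);
    }
    void tick();  // copy next value to current value for all registers
private:
    std::vector<Reg*> regs_;
};

// Hypothetical registered signal: a reference reads the current value,
// an assignment writes the next value.
class Reg {
public:
    Reg(Clock& clk, double init) : current_(init), next_(init) {
        clk.attach(this);
    }
    double read() const { return current_; }
    void assign(double v) { next_ = v; }
    void update() { current_ = next_; }
private:
    double current_;
    double next_;
};

void Clock::tick() {
    for (Reg* r : regs_) r->update();
}
```

Because assignments only touch the next value, two registers can exchange their contents within one clock cycle without a temporary: both reads see the values from before the clock tick.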

Using operations, signals are assembled to expressions. By 
using the overloading mechanism as shown in figure 16, the 
parser of the C++ compiler is reused to construct the 
signal flowgraph data structure. 

An example of this is shown in figure 17. The top of the
figure shows a C++ fragment (40). Executing this yields the
data structure (41) shown below it. It is seen that

• The signal flowgraph consists of both user defined nodes
and operation nodes. Operation nodes keep track of their
operands through pointers. The user defined signals are
atomic and have null operand pointers.

• The assignment operations use reversed pointers, making
it possible to find the start of the expression tree that
defines a signal.
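
The way operator overloading lets the C++ parser build the flowgraph can be shown with a reduced example. The types Node and Sig and the function dump are hypothetical and much simpler than the classes of figure 16; assignment nodes with reversed pointers are left out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical flowgraph node: a user defined signal has null operand
// pointers, an operation node points at its two operands.
struct Node {
    std::string label;
    std::shared_ptr<Node> left;
    std::shared_ptr<Node> right;
};

// Lightweight handle so that ordinary C++ expressions build the graph.
struct Sig {
    std::shared_ptr<Node> node;
    explicit Sig(const std::string& name)
        : node(std::make_shared<Node>(Node{name, nullptr, nullptr})) {}
    explicit Sig(std::shared_ptr<Node> n) : node(std::move(n)) {}
};

// Overloaded operators create operation nodes instead of computing.
Sig operator+(const Sig& a, const Sig& b) {
    return Sig(std::make_shared<Node>(Node{"+", a.node, b.node}));
}
Sig operator*(const Sig& a, const Sig& b) {
    return Sig(std::make_shared<Node>(Node{"*", a.node, b.node}));
}

// Print the expression tree in prefix form, for inspection.
std::string dump(const std::shared_ptr<Node>& n) {
    assert(n != nullptr);
    if (!n->left) return n->label;
    return "(" + n->label + " " + dump(n->left) + " " + dump(n->right) + ")";
}
```

Writing a * b + c with three Sig objects then yields a tree whose root is the + node, with the * node and the signal c as operands; operator precedence is handled by the C++ compiler.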

A set of sig expressions can be assembled in a signal flow
graph (SFG). In addition, the desired inputs and outputs of
the signal flowgraph have to be indicated. This allows
semantic checks such as dangling input and dead code
detection, which warn the user of code inconsistency.

An SFG has well defined simulation semantics and represents
one clock cycle of behavior.

Finite State Machines

After all instructions are described as SFG objects, the
control behavior of the component has to be described. We
use a Mealy-type FSM model to do this.

Again, the use of C++ objects allows very compact
and efficient descriptions. Figure 18 shows a graphical and
a C++ textual description of the same FSM. The correspondence
is obvious. To describe an equivalent FSM in an event
driven HDL, one usually has to follow the HDL simulator
semantics, and for example use multi-process modelling. By
using C++ on the other hand, the semantics can be adapted
depending on the type of object processed, all within the
same piece of source code.

Architectural Freedom 

An important property of the combined control/data model is
the architectural freedom it offers. As an example, the
final system architecture of the DECT transceiver is shown
in figure 19. It consists of a central (VLIW) controller
(50), a program counter controller (51) and 22 datapath
blocks. Each of these is modelled with the combined
control/data processing shown above. They exchange data
signals that, depending on the particular block, are
interpreted as instructions, conditions or signal values.
By means of these interconnected FSMD machines, a more
complex machine is constructed.

It is now motivated why this architectural freedom is
necessary. For the DECT transceiver, there is a severe
latency requirement. Originally, a dataflow target
architecture was chosen (figure 20), which is common for
this type of telecommunications signal processing. In such
an architecture, the individual components are controlled
locally and data driven. For example, the header detector
processor signals a DECT header start (a correlation
maximum) as soon as it is sure that a global maximum is
reached.

Because of the latency requirement however, extra delay in
this component cannot be allowed, and it must signal the
first available correlation maximum as a valid DECT header.
In case a new and better maximum arrives, the header
detector block must then raise an exception to subsequent
blocks to indicate that processing should be restarted.
Such an exception has global impact. In a data driven
architecture however, such global exceptions are very
difficult to implement. This is far easier in a central
control architecture, where it will take the form of a jump
in the instruction ROM. Because of these difficulties, the
target architecture was changed from data driven to central
control. The FSMD machine model allowed the
datapath descriptions to be reused and only required the
control descriptions to be reworked. This architectural
change was done during the 18-week design cycle.

The Cycle Scheduler 

Whenever a timed description is to be simulated, a cycle
scheduler is used instead of a dataflow scheduler. The
cycle scheduler creates the illusion of concurrency between
components on a clock cycle basis.

The operation of the cycle scheduler is best illustrated
with an example. In figure 21, the simulation of one cycle
in a system with three components is shown. The first two,
components 1 (60) and 2 (61), are timed descriptions
constructed using fsm and sfg objects. Component 3 (62) on
the other hand is described at high level using a firing
rule and a behavior. In the DECT transceiver, such a loop
of detailed (timed) and high level (untimed) components
occurs for instance in the RAM cells that are attached to
the datapaths. In that case, the RAM cells are described at
high level while the datapaths are described at clock cycle
true level.

The simulation of one clock cycle is done in three phases.
Traditional RT simulation uses only two: the first being an
evaluation phase, and the second being a register update
phase.

The three phases used by the cycle scheduler are a token
production phase, an evaluation phase and a register update
phase.

The three-phase simulation mechanism is needed to avoid
apparent deadlocks that might exist at the system level.
Indeed, in the example there is a circular dependency
between components 1, 2, and 3, and a dataflow scheduler
can no longer select which of the three components should
be executed first. In dataflow simulation, this is solved
by introducing initial tokens on the data dependencies.
Doing so would however require us to devise a buffer
implementation for the system interconnect, and introduce
an extra code generator in the system.

The cycle scheduler avoids this by creating the required
initial tokens in the token production phase. Each of the
phases operates as follows.

[0] At the start of each clock cycle, the sfg descriptions
to be executed in the current clock cycle are selected. In
each fsm description, a transition is selected, and the
sfg related to this transition are marked for execution.

[1] Token production phase. For each marked sfg, look into
the dependency graph, and identify the outputs that
solely depend on registered signals and/or constant
signals. Evaluate these outputs and put the obtained
tokens onto the system interconnect.

[2] (a) Evaluation phase (case a). In the second phase,
schedule marked sfg and untimed blocks for execution
until all marked sfg have fired. Output tokens are
produced if they are directly dependent on input tokens
for timed sfg descriptions, or else if they are outputs
of untimed blocks.

[2] (b) Evaluation phase (case b). Outputs that depend
only on registered signals or constants are not produced
in the evaluation phase.

[3] Register update phase. For all registered signals in
marked sfg, copy the next-value to the current-value.

The evaluation phase of the three-phase simulation is an
iterative process. If a pre-set number of iterations has
passed, and there are still unfired components, then the
system is declared to be deadlocked. This way, the cycle
scheduler identifies combinatorial loops in the system.
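
The three phases and the deadlock test can be sketched as follows. This is a reduced model under stated assumptions, not the cycle scheduler itself: Component and simulate_cycle are hypothetical names, step [0] (transition selection) is assumed to have happened already, and each component is represented by three callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical view of a component as seen by the scheduler.
struct Component {
    std::function<void()> produce;       // phase 1: outputs that depend
                                         // only on registers/constants
    std::function<bool()> try_evaluate;  // phase 2: true once it fired
    std::function<void()> update;        // phase 3: next -> current
};

// Simulate one clock cycle. Returns false when components remain
// unfired after max_iterations evaluation passes, in which case the
// system is declared deadlocked and no register is updated.
bool simulate_cycle(std::vector<Component>& comps, int max_iterations) {
    assert(max_iterations > 0);

    for (Component& c : comps) c.produce();           // phase 1

    std::vector<bool> fired(comps.size(), false);     // phase 2
    std::size_t remaining = comps.size();
    for (int it = 0; it < max_iterations && remaining > 0; ++it) {
        for (std::size_t i = 0; i < comps.size(); ++i) {
            if (!fired[i] && comps[i].try_evaluate()) {
                fired[i] = true;
                --remaining;
            }
        }
    }
    if (remaining > 0) return false;                  // deadlock

    for (Component& c : comps) c.update();            // phase 3
    return true;
}
```

In a loop of a registered component and a combinatorial one, the registered component breaks the circular dependency by producing its output token in phase 1, so the combinatorial component can fire first in phase 2.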

Code Generation and Simulation Strategy 

The clock-cycle true, bit-true description of system
components serves a dual purpose. First, the descriptions
have to be simulated in order to validate them. Next, the
descriptions also have to be translated to an equivalent,
synthesizable HDL description.

In view of these requirements, the C++ description itself
can be treated in two ways in the programming environment.
In case of a compiled code approach, the C++ description is
translated to directly executable code. In case of an
interpreted approach, the C++ description is preprocessed
by the design system and stored as a data structure in
memory.

Both approaches have different advantages and uses. For 
simulation, execution speed is of primary importance. 
Therefore, compiled code simulation is needed. On the other 
hand, HDL code generation requires the C++ description to 
be available as a data structure that can be processed by a 
code generator. Therefore, a code generator requires an 
interpreted approach. 

We meet this dual goal by using the strategy shown in
figure 22. The clock-cycle true and bit-true description of
the system is compiled and executed. The description uses
C++ objects such as signals and finite state machine
descriptions which translate themselves to a control/data
flow data structure.

This data structure can next be interpreted by a simulator
for quick verification purposes. The same data structure is
also processed by a code generator to yield two different
descriptions.

A C++ description can be regenerated to yield an
application-specific and optimized compiled code simulator.
This simulator is used for extensive verification of the
design because of the efficient simulation runtimes.
A synthesizable HDL description can also be generated to
arrive at a gate-level implementation.

The simulation performance difference between these three
formats (interpreted C++ objects, compiled C++, and HDL) is
illustrated in table 1. Simulation results are shown for
the DECT header correlator processor, and also the complete
DECT transceiver ASIC.

The C++ modelling gains a factor of 5 in code size (for the
interpreted-object approach) over RT-VHDL modeling. This is
an important advantage given the short design cycle for the
system. Compiled code C++ on the other hand provides faster
simulation and smaller process size than RT-VHDL.

For reference, results of netlist-level VHDL and Verilog
simulations are given.



Design  Size     Type                   Source Code  Simulation Speed  Process Size
        (Gates)                         (# lines)    (cycles/s)        (Mb)

HCOR    6K       C++ (interpreted obj)  230          69                3.8
                 C++ (compiled)         1700         819               2.7
                 VHDL (RT)              1600         251               11.9
                 VHDL (Netlist)         77000        2.7               81.5

DECT    75K      C++ (interpreted obj)  8000         2.9               20
                 C++ (compiled)         26000        60                5.1
                 Verilog (Netlist)      59000        18.3              100

Table 1.



Synthesis Strategy 

Finally, the synthesis approach that was used for the DECT
transceiver is documented. As shown in figure 1D, the
clock-cycle true, bit-true C++ description can be
translated from within the programming environment into
equivalent HDL.

For each component, a controller description and a datapath
description are generated, in correspondence with the C++
description. This is done because we rely on separate
synthesis tools for both parts, each one optimized towards
controller or else datapath synthesis tasks.

For datapath synthesis, we rely on the Cathedral-3 back-end
datapath synthesis tools, which produce a
bitparallel hardware implementation starting from a set of
signal flowgraphs. These tools allow operator sharing at
word level, and result in run times less than 15 minutes
even for the most complex, 57-instruction data path of the
DECT transceiver.

Controller synthesis on the other hand is done by logic
synthesis such as Synopsys DC. For pure logic synthesis
such as FSM synthesis, this tool produces efficient
results. The combined netlists of datapath and controller
are also post-optimized by Synopsys DC to perform
gate-level netlist optimizations. This divide and conquer
strategy towards synthesis allows each tool to be applied
at the right place.

During system simulation, the system stimuli are also
translated into testbenches that make it possible to verify
the synthesis result of each component. After interconnecting
all synthesized components into the system netlist, the
final implementation can also be verified using a generated
system testbench.



Example 4: design of a QAM transmission system with OCAPI 
(figure 23) 

A QAM transmission system, that includes a transmitter, a 
channel model, and a receiver is designed. 



System Specification 



A system specification in OCAPI is an executable model: an
executable file that can be run as a software program on a
computer. The principle of executable specification, as it
is called, is very important for system design: it allows
the specification to be checked using simulations. In this
case, we are designing a QAM transmission system. A full
communications system contains a transmitter, a channel
model, and a receiver. The ensemble of the transmitter,
channel model and receiver organized as an executable
specification is also called an end-to-end executable
specification. The term end-to-end indicates that
the simulation starts with a user message, and ends with a
(received) user message. In between, the complete digital
transmission is modeled, as shown in figure 23.

In this text, the complete transmission system will be
developed. The development of a component in such a system
is never a one-shot process. Rather, development proceeds
through a design flow: a collection of subsequent design
levels connected by 'natural' design tasks. For a modem,
the typical design levels are:

- a statistical level, to do high level explorations of
algorithms. In OCAPI, this level is called the link
level.

- a functional level, to assemble selected algorithms into
a single operational modem. In OCAPI, this level is
called the algorithm level.

- a structural level, to represent the modem as a machine
that executes a functional description. In OCAPI, this
level is called the architecture level.

Each of these levels has its own set of requirements.
Statistical requirements can be for example a bit error
rate or a cell loss ratio. Functional requirements are for
instance the set of modulation schemes to support.
Finally, structural requirements are requirements like
type of interfaces, or preselected architectures.

Arranging the requirements beside the design levels yields
the design flow, as shown in figure 1B. The dashed box
contains the levels that will be coded in C++-OCAPI. The
upper level (the statistical one) is described in a
language like Matlab. It is not included in this text: we
will start from a complete functional specification. The
functional specification is given herebelow in part A.

Design Flow In OCAPI-C++

Overall Design Flow



A design flow with OCAPI looks, from a high level point
of view, as shown in figure 1C. The initial
specification is an algorithmic model, constructed in
C++. Through the use of refinement, we will construct
an architecture model out of it. Next, relying on code
generation, we obtain a synthesizable architecture
model. This model can be converted to a technology-mapped
architecture in terms of gates. OCAPI is
concerned with the C++ layers of this flow, and in
addition takes care of code generation issues.

Algorithmic Models 

The algorithmic models in OCAPI use the dataflow
computational model. The construction of this code is
discussed by means of small examples selected out of
Part B (below).

First, we consider the construction of an actor. An
actor is a subalgorithm out of a dataflow system model.
In OCAPI, each actor is defined by one class. As an
example of actor definition, we take the diffenc block
out of the transmitter. The include file (3.3) defines
a class diffenc (line 10) that inherits from a base
class. This inheritance defines the class under
definition as a dataflow actor. The dataflow actor
defines a constructor, a run method and a reset method.
The run method (line 25) is the method that is called
when the actor should be executed. The constructor
(line 20) takes along parameters that include the name
(name), the I/O ports (_symb1, _symb2) and other
attributes (_qpsk, _diff_mode). The type FB
(Flow-Buffer) is the type of a FIFO queue. Looking at
the implementation of run (??, line 26), we distinguish
a firing rule in lines 29-30. The getSize() method of a
queue returns the number of elements in that queue. The
firing rule expresses that the run() method should
return whenever there is no data available in the
queue. Otherwise, processing continues as described
beyond line 32. (This processing is the implementation
of the spec as described in Part A.)

A dataflow system is constructed out of such actors.
The system code in 5.3 shows how the diffenc actor is
instantiated (lines 57-61). Besides actors, the system
code also creates interconnect queues (lines 42-48). By
giving these as parameters in the constructor of
actors, the required communication links are
established. Besides the interconnection of actors,
the system code also needs to create a scheduler. This
scheduler will repeatedly test firing rules in the
actors (by calling their run() method). The system
scheduler that steers the differential encoder is shown
on line 77 of 5.3. After this object is created, all
dataflow actors that should be under control of it are
"shifted into" it. The scheduler object has a method,
run(), that tries firing all of the actors associated
with the schedule just once.
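
The interplay of firing rules, queues and a scheduler can
be sketched without OCAPI itself. The block below is only
an illustration under stated assumptions: Queue, Actor,
Doubler, Scheduler and run_chain are hypothetical stand-ins
(a std::deque replaces the FB queue, and the scheduler's
add() replaces the "shift into" step); none of these names
belong to the OCAPI library.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for an OCAPI FB queue.
using Queue = std::deque<int>;

// Hypothetical stand-in for the OCAPI base actor class.
struct Actor {
    virtual ~Actor() = default;
    // Returns 1 if the actor fired, 0 if its firing rule was not met.
    virtual int run() = 0;
};

// Example actor: consumes one token and produces twice its value.
struct Doubler : Actor {
    Queue &in;
    Queue &out;
    Doubler(Queue &in_, Queue &out_) : in(in_), out(out_) {}
    int run() override {
        if (in.size() < 1)      // firing rule: one input token is needed
            return 0;
        int v = in.front();
        in.pop_front();
        out.push_back(2 * v);   // core functionality
        return 1;
    }
};

// Each call to run() tries to fire every registered actor once.
struct Scheduler {
    std::vector<Actor *> actors;
    void add(Actor &a) { actors.push_back(&a); }
    int run() {
        int fired = 0;
        for (Actor *a : actors)
            fired += a->run();
        return fired;           // number of actors that fired
    }
};

// Connects two actors through queues, runs the schedule until no
// actor fires any more, and returns the tokens on the output queue.
std::vector<int> run_chain(const std::vector<int> &stimuli) {
    Queue q0(stimuli.begin(), stimuli.end()), q1, q2;
    Doubler d1(q0, q1), d2(q1, q2);
    Scheduler sched;
    sched.add(d1);
    sched.add(d2);
    while (sched.run() > 0) {
    }
    return std::vector<int>(q2.begin(), q2.end());
}
```

Passing the queues to the actor constructors is what
establishes the communication links; the scheduler itself
knows nothing about the interconnection.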

Architecture Models

An architecture model expresses the behavior of the
algorithmic model in terms of operations onto hardware.
The kind of hardware features that affect this depend
of course on the target architectural style. OCAPI is
intended for a bit-parallel, synchronous style. For
this kind of style, two kinds of refinements are
necessary: First, the data types need to be expressed
in terms of fixed point numbers. Second, the execution
needs to proceed in terms of clock cycles. The first
kind of refinement is called fixed point modeling. The
second kind is called cycle true modeling. These two
refinements can be done in any order; for a complete
architecture model, both are needed.

We first give an example of how fixed point numbers are
expressed in C++. Consider the ad block of the
transmitter (3.2, line 24-27). The purpose of this
block is to introduce a quantization effect, such as
for instance would be encountered when the signal
passes through an analog-digital or digital-analog
converter. In this case, the high level algorithmic
model is constructed with a fixed point number in order
to perform this quantization. On line 32, an object of
type dfix (called indfix) is created. This object
represents a fixed point value. The constructor uses
three parameters. The first, '0', provides an initial
value. The following two (W and L) are parameters that
represent the wordlength and fractional wordlength
respectively. The operation of the ad block is as
follows. When there is information in the input queue,
the value read is assigned to the fixed point number
indfix. At the moment of assignment, quantization
happens, whether or not the input value was a floating
point value. (The FIFO buffers are actually passing
along objects of type dfix, so that floating as well as
fixed point numbers can be passed from one block to the
other.)

A next example will show how cycle true modeling is
done. We consider the derandomizer function of the
receiver (6.4). First, looking at the algorithmic model
(line 69-102), we see that the block reads two inputs
(byte_in and syncro) and writes one output (byte_out).
In between, it performs some algorithmic processing
(line 89-97). The architecture model is shown in the
define() function starting at line 116. The first few
lines are type definitions and signal declarations.
Next, four instructions are defined (line 143-179), and
a controller which sequences these instructions is
specified (line 184-195). The architecture model makes
heavy use of macros to ease the job of writing code.
All of these are explained above. The goal of the
define() function is to define an object hierarchy
consisting of signals, expressions, states, etc. that
represents the cycle true behavior of a processor. At
the top of the hierarchy is a finite state machine
object. The member function fsm() (line 106) returns
this object (which is a data member of the derandomizer
class). The system integration of the derandomizer
(5.3, line 169-176) is the same for the algorithmic and
architecture model. The selection between algorithmic
and architecture model is done by giving a system
scheduler either a base object (as in line 186) or else
the fsm object for simulation (as in line 206).
Remember that the algorithmic model creates a class
that derives from the base object, while an
architecture model defines a finite state machine
object.
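
The quantization that happens on assignment to a fixed
point number can be illustrated with plain doubles. The
sketch below is not the dfix implementation: the function
name quantize is hypothetical, and the choice of
truncation toward minus infinity with saturation on
overflow is an assumption, since the rounding and overflow
behavior of dfix is not stated in this text. W is the
total wordlength (two's complement, sign bit included) and
L the fractional wordlength, as in the ad block.

```cpp
#include <cassert>
#include <cmath>

// Quantizes x to a signed fixed point grid with W total bits and
// L fractional bits. Assumed behavior: floor rounding, saturation.
double quantize(double x, int W, int L) {
    const double scale = std::ldexp(1.0, L);             // 2^L
    const double max_code = std::ldexp(1.0, W - 1) - 1;  // largest code
    const double min_code = -std::ldexp(1.0, W - 1);     // smallest code
    double code = std::floor(x * scale);                 // integer code
    if (code > max_code) code = max_code;                // saturate high
    if (code < min_code) code = min_code;                // saturate low
    return code / scale;                                 // back to a value
}
```

With W = 8 and L = 4 the grid step is 1/16, so an input of
0.3 lands on 0.25, and any input beyond the representable
range is clipped to the nearest end of that range.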

Code Generation 

Finally we indicate the output of the code generation
process. When an architecture model is constructed,
several code generators can be used. OCAPI currently
can generate RT-VHDL code directly, or else
Cathedral-3 dsfg code. When the member function
generate() of a system scheduler is called, Cathedral-3
code will be produced, along with the required system
link cells. The member function vhdlook() on the other
hand produces RT-VHDL code. In this example, we have
used the vhdlook() method (5.2, line 401). We consider
the derandomizer block in the receiver. The first place
where this appears in the generated code is in the
system netlist (6.13, line 70 and line 143). Next, we
can find the definitions of the block itself: its
entity declaration (6.14), the RTL code (6.15), and a
mapping cell from the fixed-point VHDL type FX to the
more common VHDL type std_logic (6.16). By using this
last mapping cell, we can also hook up the VHDL code
for derand in a generated testbench (6.17). This
testbench driver reads stimuli recorded during the C++
simulation and feeds them into the VHDL simulation.

Part A: System Specification

System Contents

The end-to-end model of the QAM transmission system
under consideration is shown in figure 23. It consists
of four main components:

- A byte generator GEN (201).

- A transmitter TX (203).

- A channel model CHAN (205).

- A receiver RX (207).

The byte generator generates a sequence of random bytes.
These are modulated inside of the transmitter to a QAM
signal. The channel model next introduces distortions in
the signal, similar to those occurring in a real channel.
Finally, the receiver demodulates the signal, returning a
decoded byte sequence. If no bit errors occur, then this
sequence should be the same as the one created by the byte
generator.

Next, the detailed operation of the transmitter, the
channel and the receiver is discussed. For the internal
construction of a component, one might however still refer
to figure 24.
Transmitter Specification

The Transmitter includes

- rnd: A randomizer, which transforms a byte sequence into
a pseudorandom byte sequence. This is done because of
the more regular spectral properties of a randomized
(or 'whitened') byte sequence.

- tuple: A tuplelizer, which chops the transmitted bytes
into QAM/QPSK symbols.

- diffenc: A differential encoder which applies
differential encoding to the symbols.

- map: A QAM symbol mapper, which translates QAM symbols
to I/Q pulse sequences.

- shape: A pulse shaper, which transforms the pulse
sequences to a continuous wave. In digital
implementation, the temporal 'continuity' is achieved by
applying oversampling.

- da: Finally, there is a block which applies quantization
to the signal. This block simulates the effect of a
digital-to-analog converter.

The transmitter reads in a byte sequence, and randomizes
this with a pseudorandom byte sequence. The sequence
contains a synchronization word to align the receiver
derandomizer to the transmitter randomizer. The
pseudorandom sequence is generated by exoring a bitstream
with a bitstream produced by a linear feedback shift
register (LFSR). The LFSR produces a bitstream according
to the polynomial g(x) = 1 + x^5 + x^6. It next feeds the
bytes to a tuplelizer that generates symbols out of the
byte sequence according to the following scheme.
Given bits b7 b6 b5 b4 b3 b2 b1 b0,

Bit   QPSK          QAM16
b7    I symbol 0    I[1] symbol 0
b6    Q symbol 0    I[0] symbol 0
b5    I symbol 1    Q[1] symbol 0
b4    Q symbol 1    Q[0] symbol 0
b3    I symbol 2    I[1] symbol 1
b2    Q symbol 2    I[0] symbol 1
b1    I symbol 3    Q[1] symbol 1
b0    Q symbol 3    Q[0] symbol 1

The symbol values are next fed to the differential encoder
that generates a diff-encoded symbol sequence:

i = (((~(a ^ b)) & (a ^ glbIstate)) | ((a ^ b) & (b ^ glbQstate))) & 1;
q = (((~(a ^ b)) & (b ^ glbQstate)) | ((a ^ b) & (a ^ glbIstate))) & 1;

with i and q the output msbs of the differentially encoded
symbol; glbIstate, glbQstate the previous values of i and
q; and a and b the input msbs of the input symbol. The
lsbs are left untouched (only for QAM16). The differentially
encoded symbol sequence is next mapped to the actual symbol
value using the following constellation for QPSK.

[Figure: QPSK constellation; recoverable axis levels: +3, -3]

For QAM16, the following constellation will be used.

[Figure: QAM16 constellation]

After mapping, the resulting complex sequence is pulse
shaped. An RRC shaping filter with oversampling n = 4 is
taken, with the rolloff factor set at r = 0.3. After pulse
shaping, the sequence is upconverted to fc = fs/4 in the
multiplexer block (included in the shaper).
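
Upconversion to fc = fs/4 needs no multiplier, because the
carrier samples cos(pi*n/2) and sin(pi*n/2) only take the
values 0, +1 and -1. The sketch below illustrates this
under the assumption that the passband signal is
s[n] = I[n]*cos(pi*n/2) - Q[n]*sin(pi*n/2), which agrees
with the order of the switch statement in listing 3.10. The
function name upconvert_fs4 is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiplexes oversampled I and Q sequences onto a carrier at fs/4.
// The carrier phase advances by 90 degrees per sample, so the output
// cycles through I, -Q, -I, Q.
std::vector<double> upconvert_fs4(const std::vector<double> &i,
                                  const std::vector<double> &q) {
    assert(i.size() == q.size());
    std::vector<double> s(i.size());
    for (std::size_t n = 0; n < i.size(); ++n) {
        switch (n % 4) {
        case 0:  s[n] =  i[n]; break;  // cos = +1, sin =  0
        case 1:  s[n] = -q[n]; break;  // cos =  0, sin = +1
        case 2:  s[n] = -i[n]; break;  // cos = -1, sin =  0
        default: s[n] =  q[n]; break;  // cos =  0, sin = -1
        }
    }
    return s;
}
```

With an oversampling factor of 4, one carrier period spans
exactly one symbol period, which is why the shaper can do
the multiplexing inside its per-symbol loop.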
Channel Model Specification

The Channel Model contains

- FIR filter with programmable taps. The filter is used
to simulate linear distortions such as multipath
effects.

- Noise injection block.

The incoming signal is fed into a 20 tap filter. The
second, third, fourth and 21st tap of the filter are
programmable. Next a noise signal is added to the
sequence. The noise distribution is gaussian:

X1 = sqrt(-2*ln(U1)) * cos(2*pi*U2)

X2 = sqrt(-2*ln(U1)) * sin(2*pi*U2)

U1, U2 are independent and uniform [0,1],
X1 and X2 are independent and N(0,1)
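
The two formulas above are the Box-Muller transform. A
minimal sketch follows; the function name box_muller is
hypothetical, and the caller is assumed to supply U1 in
(0, 1], because ln(0) is not finite (listing 4.4 adds a
small offset to its uniform samples for the same reason).

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Maps two independent uniform samples to two independent N(0,1)
// samples. Precondition (assumed): 0 < u1 <= 1.
std::pair<double, double> box_muller(double u1, double u2) {
    const double two_pi = 2.0 * std::acos(-1.0);
    const double r = std::sqrt(-2.0 * std::log(u1));  // radius
    return { r * std::cos(two_pi * u2),               // X1
             r * std::sin(two_pi * u2) };             // X2
}
```

In the channel model only X1 is used: it is scaled by the
noise level and added to each incoming sample.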


Receiver Specification

The Receiver includes

- lmsff: A feed forward, T/4 spaced LMS equalizer.

- demap: A demapper, translating a complex signal
back to a QAM symbol.

- detuple: A detupler, glueing individual symbols back
to bytes.

- derand: A derandomizer, translating the pseudonoise
sequence back to an unrandomized sequence.

It is not difficult to see that this signal processing
corresponds to the reverse processing that was applied at
the transmitter. The incoming signal is fed into an
equalizer block. The 4 tap oversampled FF equalizer is
initialized with a downconverting RRC sequence. This way,
the equalizer will act at the same time as a matched
filter, a symbol timing recovery loop, a phase recovery
loop, and an intersymbol-interference removing device. It
is a simple solution to the physical synchronization
problem in QAM.

The equalizer is initialized as follows. Given the complex
RRC

[equation not reproduced in source]

then the LMS should be initialized with

[equation not reproduced in source]

The coefficient adaptation algorithm of the equalizer is of
the Least Mean Square type. This algorithm is decision
directed; such algorithms are able to do tracking in a
synchronization loop, but not to do acquisition
(initialization) of the same loops. For simplicity in this
example, we will however abstract away this acquisition
problem. Next, the inverse operations of the transmitter
are performed: the demodulated complex signal is converted
to a QAM symbol in the demapper. The resulting QAM symbol
stream is differentially decoded and assembled to a byte
sequence in the detupler. The differential decoding
proceeds according to

a = (((~(i ^ q)) & (i ^ glbIstate)) | ((i ^ q) & (q ^ glbQstate))) & 1;
b = (((~(i ^ q)) & (q ^ glbQstate)) | ((i ^ q) & (i ^ glbIstate))) & 1;
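
That the decoding rule undoes the encoding rule of the
transmitter can be checked exhaustively, since only two
input bits and two state bits are involved. The sketch
below is a standalone illustration: DiffState, diff_encode,
diff_decode and round_trip_ok are hypothetical names, and
it assumes that the decoder state holds the previously
received i and q, mirroring the encoder state.

```cpp
#include <cassert>

// Previous (i, q) pair; plays the role of glbIstate and glbQstate.
struct DiffState {
    int i;
    int q;
};

// Differentially encodes the msbs (a, b) into (i, q).
void diff_encode(int a, int b, DiffState &st, int &i, int &q) {
    i = (((~(a ^ b)) & (a ^ st.i)) | ((a ^ b) & (b ^ st.q))) & 1;
    q = (((~(a ^ b)) & (b ^ st.q)) | ((a ^ b) & (a ^ st.i))) & 1;
    st.i = i;  // remember the transmitted bits
    st.q = q;
}

// Differentially decodes (i, q) back into the msbs (a, b).
void diff_decode(int i, int q, DiffState &st, int &a, int &b) {
    a = (((~(i ^ q)) & (i ^ st.i)) | ((i ^ q) & (q ^ st.q))) & 1;
    b = (((~(i ^ q)) & (q ^ st.q)) | ((i ^ q) & (i ^ st.i))) & 1;
    st.i = i;  // remember the received bits
    st.q = q;
}

// Returns true when decode(encode(a, b)) == (a, b) for all four
// input pairs and all four starting states.
bool round_trip_ok() {
    for (int s = 0; s < 4; ++s) {
        for (int ab = 0; ab < 4; ++ab) {
            DiffState tx = { s >> 1, s & 1 };
            DiffState rx = { s >> 1, s & 1 };
            int a = ab >> 1, b = ab & 1;
            int i, q, a2, b2;
            diff_encode(a, b, tx, i, q);
            diff_decode(i, q, rx, a2, b2);
            if (a2 != a || b2 != b)
                return false;
        }
    }
    return true;
}
```

Because the decoder only looks at the difference between
consecutive symbols, the same result holds when encoder
and decoder state are both rotated by a quadrant, which is
the purpose of differential encoding.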
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Finally, the pseudorandom encoding of the sequence is 
removed in the derandomizer.
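
Randomizing and derandomizing are the same operation:
exoring a byte stream twice with the same LFSR bitstream
returns the original bytes. The sketch below illustrates
this with an LFSR using the taps of g(x) = 1 + x^5 + x^6
and an all-ones start state, as in listing 3.8. It is a
simplified illustration only: Lfsr and scramble are
hypothetical names, and the synchronization words and the
periodic reset of the real randomizer are left out.

```cpp
#include <cassert>
#include <vector>

// 7-bit shift register; feedback is the exor of stages 5 and 6.
struct Lfsr {
    int state = 127;  // all ones, the assumed reset state
    int next_bit() {
        int r = ((state >> 4) & 1) ^ ((state >> 5) & 1);
        state = (r | (state << 1)) & 0x7f;  // shift in the new bit
        return r;
    }
};

// Exors every bit of every byte (msb first) with the LFSR stream.
// A fresh LFSR is used per call, so applying scramble twice
// restores the input.
std::vector<int> scramble(const std::vector<int> &bytes) {
    Lfsr lfsr;
    std::vector<int> out;
    for (int byte : bytes) {
        int outbyte = 0;
        for (int i = 7; i >= 0; --i)
            outbyte = (outbyte << 1) | (lfsr.next_bit() ^ ((byte >> i) & 1));
        out.push_back(outbyte);
    }
    return out;
}
```

In the real system the receiver has to find the start of
the bitstream first, which is what the synchronization
word is for.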

Part B: C++ code of the QAM system 

3 Transmitter Code 
3.1 tx/ad.h 

 1 // ad.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)ad.h 1.2 03/20/98
 4
 5 #ifndef AD_H
 6 #define AD_H
 7
 8 #include "qlib.h"
 9
10 class ad : public base {
11   FB *in;
12   FB *ot;
13   double *W;
14   double *L;
15
16 public:
17   ad(char *name, FB &_in, FB &_ot, double &_W, double &_L);
18   int run();
19   int reset();
20 };
21
22 #endif
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3.2 tx/ad.cxx 

 1 // ad.cxx
 2 // All rights reserved -- Imec
 3 // @(#)ad.cxx 1.4 03/31/98
 4
 5 #include "ad.h"
 6
 7 ad::ad(char *name,
 8        FB &_in,
 9        FB &_ot,
10        double &_W,
11        double &_L) : base(name)
12 {
13   in = _in.asSource(this);
14   ot = _ot.asSink(this);
15   W = &_W;
16   L = &_L;
17 }
18
19 int ad::reset() {
20   // return to initial state
21   return 1;
22 }
23
24 int ad::run() {
25
26   // firing rule
27   if (in->getSize() < 1) {
28     return 0;
29   }
30
31   // core functionality
32   dfix indfix(0, (int)(*W), (int)(*L));
33   indfix = in->get();  // inputting + quantization assignment
34   ot->put(indfix);     // outputting
35
36   return 1;
37 }
38

3.3 tx/diffenc.h 

 1 // diffenc.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)diffenc.h 1.7 98/03/31
 4
 5 #ifndef DIFFENC_H
 6 #define DIFFENC_H
 7
 8 #include "qlib.h"
 9
10 class diffenc : public base {
11
12   FB *symb1;
13   FB *symb2;
14   double *qpsk;
15   double *diff_mode;
16   int iState;
17   int qState;
18
19 public:
20   diffenc(char *name,
21           FB &_symb1,
22           FB &_symb2,
23           double &_qpsk,
24           double &_diff_mode);
25   int run();
26   int reset();
27 };
28
29 #endif

3.4 tx/diffenc.cxx

 1 // diffenc.cxx
 2 // All rights reserved -- Imec 1998
 3 // @(#)diffenc.cxx 1.8 98/03/31
 4
 5 #include "diffenc.h"
 6
 7 diffenc::diffenc(char *name,
 8                  FB &_symb1,
 9                  FB &_symb2,
10                  double &_qpsk,
11                  double &_diff_mode) : base(name)
12 {
13   symb1 = _symb1.asSource(this);
14   symb2 = _symb2.asSink(this);
15   qpsk = &_qpsk;
16   diff_mode = &_diff_mode;
17   reset();
18 }
19
20 int diffenc::reset() {
21   iState = 0;
22   qState = 0;
23   return 1;
24 }
25
26 int diffenc::run() {
27
28   // firing rule
29   if (symb1->getSize() < 1)
30     return 0;
31
32   // core func
33   int symb = (int) Val(symb1->get());
34
35   if ((int) *diff_mode) {
36     int a = ((int) *qpsk) ? (symb >> 1) & 1 : (symb >> 3) & 1;  // get msb's only
37     int b = ((int) *qpsk) ? (symb >> 0) & 1 : (symb >> 2) & 1;
38
39     int i = (((~(a ^ b)) & (a ^ iState)) | ((a ^ b) & (b ^ qState))) & 1;  // encode msb
40     int q = (((~(a ^ b)) & (b ^ qState)) | ((a ^ b) & (a ^ iState))) & 1;
41
42     iState = i;
43     qState = q;
44
45     symb = ((int) *qpsk) ? (i << 1) | q : (i << 3) | (q << 2) | (symb & 3);
46   }
47
48   symb2->put(symb);
49   return 1;
50 }
51



3.5 tx/map.h

 1 //
 2 //
 3 //
 4 //
 5 //
 6 //
 7 //
 8 //
 9 //
10 //
11 //   MAP
12 //
13 //   Mapping of QAM16 constellations to symbols and back
14 //
15 //
16 // Author:
17 //   Patrick Schaumont
18 //
19
20 #ifndef MAP_H
21 #define MAP_H
22
23 #include "qlib.h"
24
25 class map : public base {
26   double *qpsk;
27
28   FB *sIn;
29   FB *qOut;
30   FB *iOut;
31
32   dfix immediateQ(dfix v);
33   dfix immediateI(dfix v);
34
35 public:
36   map(char *name, FB &_sIn, FB &_iOut, FB &_qOut, double &_qpsk);
37   int run();
38
39 };
40
41 #endif

3.6 tx/map.cxx

 1 //
 2 // COPYRIGHT
 3 //
 4 //
 5 // Copyright 1996 IMEC, Leuven, Belgium
 6 //
 7 // All rights reserved.
 8 //
 9 //
10 // Module:
11 //   MAP
12 //
13 //   Mapping of QAM16 constellations to symbols and back
14 //
15 //
16 // Author:
17 //   Patrick Schaumont
18 //
19
20
21 #include "map.h"
22
23 // # # ## #####
24 //###### # #
25 // #### # # # #
26 // # # ###### #####
27 // # # # # #
28 // # # # # #
29
30
31 // QAM16
32 static double vQMap16[] = {
33   ( 0.0),
34   (+1.0), (+1.0), (+3.0), (+3.0),
35   (-1.0), (-3.0), (-1.0), (-3.0),
36   (+1.0), (+3.0), (+1.0), (+3.0),
37   (-1.0), (-3.0), (-1.0), (-3.0)
38 };
39
40 static double vIMap16[] = {
41   ( 0.0),
42   (+1.0), (+3.0), (+1.0), (+3.0),
43   (+1.0), (+1.0), (+3.0), (+3.0),
44   (-1.0), (-1.0), (-3.0), (-3.0),
45   (-1.0), (-1.0), (-3.0), (-3.0)
46 };
47
48 // QPSK
49 static double vQMap4[] = {
50   ( 0.0),
51   (+3.0), (-3.0), (+3.0), (-3.0),
52 };
53 static double vIMap4[] = {
54   ( 0.0),
55   (+3.0), (+3.0), (-3.0), (-3.0),
56 };
57
58 map::map(char *name, FB &_sIn, FB &_iOut, FB &_qOut, double &_qpsk) : base(name) {
59   sIn = &_sIn;
60   qOut = &_qOut;
61   iOut = &_iOut;
62   qpsk = &_qpsk;
63 }
64
65 dfix map::immediateQ(dfix v) {
66   if ((int) *qpsk) {
67     return dfix(vQMap4[(int) Val(v + 1)]);
68   } else {
69     return dfix(vQMap16[(int) Val(v + 1)]);
70   }
71 }
72
73 dfix map::immediateI(dfix v) {
74   if ((int) *qpsk) {
75     return dfix(vIMap4[(int) Val(v + 1)]);
76   } else {
77     return dfix(vIMap16[(int) Val(v + 1)]);
78   }
79 }
80
81 int map::run() {
82   if (sIn->getSize() < 1)
83     return 0;
84   dfix v = sIn->get();
85   *iOut << immediateI(v);
86   *qOut << immediateQ(v);
87   return 1;
88 }
89
3.7 tx/rnd.h

 1 // rnd.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)rnd.h 1.5 03/31/98
 4
 5 #ifndef RND_H
 6 #define RND_H
 7
 8 #include "qlib.h"
 9
10 #define SYNCPERIOD 54
11 #define SYNCWORD1 0x00
12 #define SYNCWORD2 0x55
13 #define SYNCWORD3 0x00
14 #define SYNCWORD4 0x55
15
16 class rnd : public base {
17   FB *input;
18   FB *output;
19   int synccntr;
20
21 public:
22   rnd(char *name, FB &_input, FB &_output);
23   int run();
24   int reset();
25 };
26
27 #endif

3.8 tx/rnd.cxx

 1 // rnd.cxx
 2 // All rights reserved -- Imec 1998
 3 // @(#)rnd.cxx 1.6 03/20/98
 4
 5 #include "rnd.h"
 6
 7 int glbRandom = 1;
 8
 9 int glbRandState;
10
11 rnd::rnd(char *name,
12          FB &_input,
13          FB &_output) : base(name)
14 {
15   input = _input.asSource(this);
16   output = _output.asSink(this);
17   synccntr = 0;
18   reset();
19 }
20
21
22 #define BIT(k, n)  ((k >> (n-1)) & 1)
23 #define MASK(k, n) (k & ((1 << (n+1)) - 1))
24
25 int randbit() {
26   int r;
27
28   r = BIT(glbRandState, 5) ^ BIT(glbRandState, 6);
29   glbRandState = MASK(r | (glbRandState << 1), 6);
30
31   if (glbRandom)
32     return r;
33   else
34     return 0;
35 }
36
37
38 // ======================== MEMBER FUNCTIONS
39
40 int rnd::reset() {
41   // return to initial state
42   glbRandState = (1 << 7) - 1;
43   return 1;
44 }
45
46 int rnd::run() {
47   // firing rule
48   if (input->getSize() < 1) {
49     return 0;
50   }
51
52
53   // core func
54   int i;
55   int outbyte = 0;
56   int inbyte = (int) Val(input->get());
57   for (i = 7; i >= 0; i--) {
58     outbyte = (outbyte << 1) | (randbit() ^ (inbyte >> i & 1));
59   }
60   synccntr++;
61   if (synccntr == SYNCPERIOD) {
62     // cerr << "*** INFO: randomizer sends SYN\n";
63     output->put(outbyte);
64     output->put(SYNCWORD1);
65     output->put(SYNCWORD2);
66     output->put(SYNCWORD3);
67     output->put(SYNCWORD4);
68     synccntr = 0;
69     reset();
70   }
71   else {
72     output->put(outbyte);
73   }
74   return 1;
75 }
76
77

3.9 tx/shape.h

 1 // shape.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)shape.h 1.3 03/18/98
 4
 5 #ifndef SHAPE_H
 6 #define SHAPE_H
 7
 8 #include "qlib.h"
 9
10 #define MAXLEN 33
11
12 class shape : public base {
13   FB *i_in;
14   FB *q_in;
15   FB *s_out;
16   double c[MAXLEN];  // RC coefficients
17
18 public:
19   shape(char *name, FB &_i_in, FB &_q_in, FB &_s_out);
20   int run();
21   int run_old();
22   int reset();
23   void makecoeffs();
24 };
25
26 #endif



3.10 tx/shape.cxx

  1 // shape.cxx
  2 // All rights reserved -- Imec 1998
  3 // @(#)shape.cxx 1.7 06/26/98
  4
  5 #include "shape.h"
  6
  7 shape::shape(char *name,
  8              FB &_i_in,
  9              FB &_q_in,
 10              FB &_s_out) : base(name)
 11 {
 12   i_in = _i_in.asSource(this);
 13   q_in = _q_in.asSource(this);
 14   s_out = _s_out.asSink(this);
 15   makecoeffs();  // RRC coeff generation
 16   reset();
 17 }
 18
 19 int shape::reset() {
 20   // return to initial state
 21   while (i_in->getSize() > 0)
 22     i_in->pop();
 23   while (q_in->getSize() > 0)
 24     q_in->pop();
 25
 26   return 1;
 27 }
 28
 29 void shape::makecoeffs() {
 30   c[0] = 2.725985e-02;
 31   c[1] = 2.079339e-01;
 32   c[2] = 4.002601e-01;
 33   c[3] = 5.241213e-01;
 34   c[4] = 5.241213e-01;
 35   c[5] = 4.002601e-01;
 36   c[6] = 2.079339e-01;
 37   c[7] = 2.725985e-02;
 38 }
 39
 40 int shape::run() {
 41   int i, j;
 42 #define NF 8
 43 #define SPS 4
 44
 45   static double deli[NF];
 46   static double delq[NF];
 47
 48   if ((i_in->getSize() < 1) ||
 49       (q_in->getSize() < 1)) {
 50     return 0;
 51   }
 52
 53   for (j = 1; j <= SPS; j++) {
 54
 55     for (i = NF-1; i >= 1; i--) {
 56       deli[i] = deli[i-1];
 57       delq[i] = delq[i-1];
 58     }
 59     if (j == 1) {
 60       deli[0] = Val(i_in->get());
 61       delq[0] = Val(q_in->get());
 62     }
 63     else {
 64       deli[0] = 0;
 65       delq[0] = 0;
 66     }
 67
 68     double acci = 0;
 69     double accq = 0;
 70     for (i = 0; i < NF; i++) {
 71       acci += deli[i]*c[i];
 72       accq += delq[i]*c[i];
 73     }
 74
 75     switch (j) {
 76     case 1: s_out->put(acci);  break;
 77     case 2: s_out->put(-accq); break;
 78     case 3: s_out->put(-acci); break;
 79     case 4: s_out->put(accq);  break;
 80     }
 81
 82   } // end for j
 83
 84   return 1;
 85 }
 86
 87
 88
 89 //  5.9502848187909857e-03
 90 //  7.1303339418111898e-03
 91 // -9.0376125958858652e-04
 92 // -1.2842591240125096e-02
 93 // -1.6560488829370935e-02
 94 // -3.1424796453581099e-03
 95 //  2.2511451978267195e-02
 96 //  4.0465840802261004e-02
 97 //  2.8302892670230756e-02
 98 // -1.9056064640367836e-02
 99 // -7.6814040516083981e-02
100 // -9.7464875081018337e-02
101 // -3.7506670742425155e-02
102 //  1.1136091774729967e-01
103 //  3.0772091871906165e-01
104 //  4.7526468799142091e-01
105 //  5.4107108989550989e-01
106 //  4.7526467788525789e-01
107 //  3.0772090304860350e-01
108 //  1.1136090307335493e-01
109 // -3.7506679314098741e-02
110 // -9.7464876235465986e-02
111 // -7.6814036683689066e-02
112 // -1.9056059903703605e-02
113 //  2.8302895170883653e-02
114 //  4.0465840334864417e-02
115 //  2.2511449901436539e-02
116 // -3.1424813892788860e-03
117 // -1.6560489169667160e-02
118 // -1.2842590440175973e-02
119 // -9.0376032591496101e-04
120 //  7.1303342199545879e-03
121 //  5.9502844100395589e-03

3.11 tx/tuplelize.h

 1 // tuplelize.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)tuplelize.h 1.4 98/03/31
 4
 5
 6 #ifndef TUPLELIZE_H
 7 #define TUPLELIZE_H
 8
 9 #include "qlib.h"
10
11 class tuplelize : public base {
12   FB *byte;
13   FB *symb;
14   double *qpsk;
15
16 public:
17   tuplelize(char *name,
18             FB &_byte,
19             FB &_symb,
20             double &_qpsk);
21   int run();
22   int reset();
23 };
24
25 #endif



3.12 tx/tuplelize.cxx

 1 // tuplelize.cxx
 2 // All rights reserved -- Imec 1998
 3 // @(#)tuplelize.cxx 1.6 98/03/31
 4
 5 #include "tuplelize.h"
 6
 7
 8 tuplelize::tuplelize(char *name,
 9                      FB &_byte,
10                      FB &_symb,
11                      double &_qpsk) : base(name)
12 {
13   byte = _byte.asSource(this);
14   symb = _symb.asSink(this);
15   qpsk = &_qpsk;
16 }
17
18 //
19
20 int tuplelize::reset() {
21   return 1;
22 }
23
24 int tuplelize::run() {
25
26   // firing rule
27   if (byte->getSize() < 1)
28     return 0;
29
30   // core func
31   int us, msk, sym;
32
33   if ((int) *qpsk) {
34     us = 2; msk = 0x03;
35   } else {
36     us = 4; msk = 0x0F;
37   }
38
39   int tuple = (int) Val(byte->get());
40
41   for (int k = 1; k <= 8/us; k++) {
42     sym = (tuple >> (8-us)) & msk;
43     tuple = (tuple << us) & 0xff;
44     symb->put(sym);
45   }
46
47   return 1;
48 }
49
50
51



4 Channel Model Code

4.1 chan/fir.h

 1 // fir.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)fir.h 1.2 03/31/98
 4
 5 #ifndef FIR_H
 6 #define FIR_H
 7
 8 #define NRTAPS 20
 9
10 #include "qlib.h"
11
12 class fir : public base {
13   FB *input;
14   FB *output;
15   double x[NRTAPS];  // filter taps: 0, 1, ..., NRTAPS-1
16   double *t1, *t2, *t3, *t20;
17
18 public:
19   fir(char *name, FB &_input, FB &_output,
20       double &_t1, double &_t2, double &_t3, double &_t20);
21   int run();
22   int reset();
23 };
24
25 #endif

4.2 chan/fir.cxx

 1 // fir.cxx
 2 // All rights reserved -- Imec 1998
 3 // @(#)fir.cxx 1.3 03/31/98
 4
 5 #include "fir.h"
 6
 7 fir::fir(char *name,
 8          FB &_input,
 9          FB &_output,
10          double &_t1, double &_t2, double &_t3, double &_t20) : base(name)
11 {
12   input = _input.asSource(this);
13   output = _output.asSink(this);
14
15   for (int i = 0; i < NRTAPS; i++) {
16     x[i] = 0;
17   }
18   t1 = &_t1;
19   t2 = &_t2;
20   t3 = &_t3;
21   t20 = &_t20;
22 }
23
24 int fir::reset() {
25   // return to initial state
26   for (int i = 0; i < NRTAPS; i++) {
27     x[i] = 0;
28   }
29   return 1;
30 }
31
32 int fir::run() {
33   // firing rule
34   if (input->getSize() < 1) {
35     return 0;
36   }
37
38   dfix in = input->get();
39
40   int i;
41   for (i = NRTAPS-1; i >= 1; i--) {
42     x[i] = x[i-1];
43   }
44   x[0] = Val(in);
45
46   // core func
47   double out = x[0] + x[1]*(*t1) + x[2]*(*t2) + x[3]*(*t3) + x[20]*(*t20);
48   output->put(out);
49
50   return 1;
51 }
52
53

4.3 chan/noise.h

 1 // noise.h
 2 // All rights reserved -- Imec 1998
 3 // @(#)noise.h 1.2 03/20/98
 4
 5 #ifndef NOISE_H
 6 #define NOISE_H
 7
 8 #include "qlib.h"
 9 #include "pseudorn.h"
10
11 class noise : public base {
12   FB *in;
13   FB *out;
14   double *n;
15   pseudorn RN;
16
17 public:
18   noise(char *name, FB &_in, FB &_out, double &_n);
19   int reset();
20   int run();
21 };
22
23 #endif
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4.4 chan/noise.cxx 

 1 // noise.cxx
 2 // All rights reserved -- Imec 1998
 3 // @(#)noise.cxx 1.3 03/20/98
 4
 5 #include "noise.h"
 6 #include <math.h>
 7
 8 noise::noise(char *name, FB &_in, FB &_out, double &_n) : base(name) {
 9   in = _in.asSource(this);
10   out = _out.asSink(this);
11   n = &_n;
12 }
13
14
15 int noise::run() {
16   // firing rule
17   if (in->getSize() < 1) {
18     return 0;
19   }
20
21   // core function
22   double U1 = (double)(RN.out())/(double) PRNMAX + 1/(double) PRNMAX;
23   double U2 = (double)(RN.out())/(double) PRNMAX + 1/(double) PRNMAX;
24
25   double X = sqrt(-2.*log(U1))*cos(2.*M_PI*U2);
26
27   out->put(Val(in->get()) + X*(*n));
28
29   return 1;
30
31 }

4.5 chan/pseudorn.h

// pseudorn.h
// All rights reserved -- Imec 1998
// @(#)pseudorn.h 1.2 03/31/98

#ifndef pseudorn_H
#define pseudorn_H

#define MULT 0x015a4e35L
#define INCR 1
#define PRNMAX 32767 // =2^15-1

#include <time.h>

class pseudorn {
  long seed;
  unsigned range;
public:
  pseudorn() {
    range = PRNMAX;
    seed = time(0);
  }
  pseudorn(unsigned s, unsigned r) {
    seed = s;
    range = r;
  }
  pseudorn(unsigned r) {
    range = r;
    seed = time(0);
  }
  unsigned out(void) {
    seed = MULT * seed + INCR;
    return ((unsigned) (seed >> 16) & 0x7fff) % range;
  }
  long getSeed() { return seed; }
  void setSeed(long s) { seed = s; }
};

#include "qlib.h"

class pseudorn_gen : public base {
  pseudorn RN;
  FB *out;
public:
  pseudorn_gen(char *name, FB & _out) :
    base(name),
    RN(255) {
    out = _out.asSink(this);
  }
  int run() {
    out->put(RN.out());
    return 1;
  }
};

#endif
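The generator in pseudorn::out() is a linear congruential step followed by a 15-bit extraction and a reduction modulo range. The sketch below restates that step with a fixed-width unsigned seed, so the multiplication wraps modulo 2^32 instead of overflowing a signed long. The names lcg_state and lcg_next are hypothetical; the constants are those of the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width restatement of the pseudorn state.
struct lcg_state {
  std::uint32_t seed;  // unsigned, so overflow is well defined
  unsigned range;      // outputs fall in 0 .. range-1
};

unsigned lcg_next(lcg_state &st) {
  assert(st.range > 0);
  st.seed = 0x015a4e35u * st.seed + 1u;           // MULT * seed + INCR
  return ((st.seed >> 16) & 0x7fffu) % st.range;  // bits 16..30, then reduce
}
```

With range set to 255, as pseudorn_gen does through RN(255), the outputs lie in 0 to 254. A fixed seed gives a repeatable sequence, which is useful when a simulation run has to be reproduced.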

4.6 chan/pseudorn.cxx

// pseudorn.cxx
// All rights reserved -- Imec 1998
// @(#)pseudorn.cxx 1.1 03/17/98

#include "pseudorn.h"

// inlined stuff


5 System Code

5.1 driver/driver.h

#ifndef DRIVER_H
#define DRIVER_H

// @(#)driver.h 1.2 98/03/20

#include "qlib.h"
#include "Callback2wRet.h"

class interpreter {
public:
  interpreter();
  void add(sysgen &s);
  void observe(double &v, char *name);
  void obsAttr(Callback2wRet<int, double, int> cb, int, char *name);
  friend interpreter & operator<<(interpreter &p, sysgen &s);
  friend interpreter & operator<<(interpreter &p, clk &ck);
  void go(int argc, char **argv);
};

#endif

5.2 driver/driver.cxx

#include "tcl.h"
#include <iostream.h>

#define MAKE_WISH

#ifdef MAKE_WISH
#include "tk.h"
#endif

// @(#)driver.cxx 1.3 98/03/27

#include "qlib.h"
#include "qtb.h"
#include "driver.h"
#include "Callback2wRet.h"

// interpreter OCAPI-related data structures
//

Callback2wRet<int, double, int> functorlist[100];
int numfunctors = 0;

int graphLines = 0;

FBQ(trace0);
FBQ(trace1);
FBQ(trace2);
FBQ(trace3);
FBQ(trace4);
FBQ(trace5);
FBQ(trace6);
FBQ(trace7);
dfbfix *traces[8];
dfbfix *tracedqueue[8];

Tcl_HashTable queue_hash;

#define IF_SUFFIX(A) if ((strlen(r->name()) > strlen(A)) && \
  (!strcmp(r->name() + strlen(r->name()) - strlen(A), A)))

void create_queue_hash() {
  Tcl_InitHashTable(&queue_hash, TCL_STRING_KEYS);

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB()) {
    int present;
    IF_SUFFIX("_mark")
      continue;
    IF_SUFFIX("_stim")
      continue;
    Tcl_SetHashValue(Tcl_CreateHashEntry(&queue_hash, r->name(), &present), (char *) r);
  }
}
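The IF_SUFFIX macro above skips queues whose names end in "_mark" or "_stim". Its test can be written as an ordinary function, sketched here under the assumption that names are plain C strings; has_suffix is a hypothetical helper and not part of the listing.

```cpp
#include <cassert>
#include <cstring>

// True when `name` is strictly longer than `suffix` and ends with it,
// which is the condition IF_SUFFIX evaluates on r->name().
bool has_suffix(const char *name, const char *suffix) {
  assert(name != 0 && suffix != 0);
  std::size_t n = std::strlen(name);
  std::size_t s = std::strlen(suffix);
  return n > s && std::strcmp(name + n - s, suffix) == 0;
}
```

The strict length comparison means that a queue named exactly "_mark" is not treated as a suffix match and would still be entered in the hash table.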

// next are created by the interpreter object itself
Tcl_HashTable sched_hash;
Tcl_HashTable doubles_hash;
Tcl_HashTable attr_hashfunc;
Tcl_HashTable attr_hashint;

clk *glbClk; // global (single) clock

//---------------------------------------------------------------//
int ListQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: listq ?queue?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, ((dfbfix *) Tcl_GetHashValue(p))->name());
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&queue_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, ((dfbfix *) Tcl_GetHashValue(p))->name());
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}



//---------------------------------------------------------------//
int GetQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: getq queue\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
  if (p != 0) {
    dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
    while (q->getSize()) {
      strstream N;
      N << Val(q->get()) << ends;
      Tcl_AppendElement(interp, N.str());
    }
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int PutQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 3) {
    interp->result = "Usage: putq queue value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);
  if (p != 0) {
    double v;
    sscanf(argv[2], "%lf", &v);
    dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
    q->put(v);
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int TraceQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 1) && (argc != 3)) {
    interp->result = "Usage: traceq ?tracenum queuename?\n";
    return TCL_ERROR;
  }

  if (argc == 1) {
    int k;
    for (k = 0; k < 8; k++) {
      strstream N;
      N << traces[k]->name() << " ";
      if (tracedqueue[k] != 0)
        N << tracedqueue[k]->name();
      N << ends;
      Tcl_AppendElement(interp, N.str());
    }
  } else {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[2]);
    dfbfix *q = 0;
    if (p != 0) {
      q = (dfbfix *) Tcl_GetHashValue(p);
    } else {
      return TCL_OK;
    }

    int num;
    for (num = 0; num < 8; num++) {
      if (!strcmp(argv[1], traces[num]->name()))
        break;
    }

    if (num > 7)
      return TCL_OK;

    if (tracedqueue[num] != 0) {
      tracedqueue[num]->asDup(nilFB);
    }

    tracedqueue[num] = q;
    q->asDup(*traces[num]);
  }
  return TCL_OK;
}


//---------------------------------------------------------------//
int ReadQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: readq queue\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[1]);

  dfbfix *q = (dfbfix *) Tcl_GetHashValue(p);
  int k;
  for (k = 0; k < q->getSize(); k++) {
    strstream N;
    N << Val((*q)[k]) << ends;
    Tcl_AppendElement(interp, N.str());
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int PlotQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {

    interp->result = "Usage: plotq queue ?...?\n";
    return TCL_ERROR;

  // headers
  PLOTBUF << "TitleText: ";
  for (i = 1; i < argc; i++) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0)
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p))->name() << " ";
  }
  PLOTBUF << "\n";

  PLOTBUF << "BackGround: Black\n";
  PLOTBUF << "ForeGround: White\n";
  PLOTBUF << "XUnitText: Sample\n";
  PLOTBUF << "BoundBox: True\n";
  PLOTBUF << "0.Color: Yellow\n";
  PLOTBUF << "LabelFont: -adobe-helvetica-*-r-*-*-16-*-*-*-*-*-*-*\n";
  PLOTBUF << "Markers: True\n";
  if (!graphLines)
    PLOTBUF << "NoLines: True\n";

  // data
  for (i = 1; i < argc; i++) {
    PLOTBUF << "\n";
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0) {
      int j;
      PLOTBUF << "\"" << ((dfbfix *) Tcl_GetHashValue(p))->name() << "\"\n";
      for (j = 0; j < ((dfbfix *) Tcl_GetHashValue(p))->getSize(); j++) {
        PLOTBUF << j << " " << ((dfbfix *) Tcl_GetHashValue(p))->getIndex(j) << "\n";
      }
    }
  }
  PLOTBUF.close();

  system(strapp(strapp("xgraph ", f), " &"));
  return TCL_OK;
}

//---------------------------------------------------------------//
int ScatQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  int i;
  if (argc != 3) {
    interp->result = "Usage: scatq queuex queuey\n";
    return TCL_ERROR;
  }

  ofstream PLOTBUF(".plotbuf");

  // headers
  PLOTBUF << "TitleText: ";
  for (i = 1; i < argc; i++) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&queue_hash, argv[i]);
    if (p != 0)
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p))->name() << " ";
  }
  PLOTBUF << "\n";

  PLOTBUF << "BackGround: Black\n";
  PLOTBUF << "ForeGround: White\n";
  PLOTBUF << "XUnitText: Sample\n";
  PLOTBUF << "BoundBox: True\n";
  PLOTBUF << "0.Color: Yellow\n";
  PLOTBUF << "LabelFont: -adobe-helvetica-*-r-*-*-16-*-*-*-*-*-*-*\n";
  PLOTBUF << "Markers: True\n";
  if (!graphLines)
    PLOTBUF << "NoLines: True\n";

  // data
  PLOTBUF << "\n";
  Tcl_HashEntry *p1 = Tcl_FindHashEntry(&queue_hash, argv[1]);
  Tcl_HashEntry *p2 = Tcl_FindHashEntry(&queue_hash, argv[2]);
  if ((p1 != 0) && (p2 != 0)) {
    int j;
    int max = ((dfbfix *) Tcl_GetHashValue(p1))->getSize();
    if (((dfbfix *) Tcl_GetHashValue(p2))->getSize() < max) {
      max = (((dfbfix *) Tcl_GetHashValue(p2))->getSize());
    }
    for (j = 0; j < max; j++) {
      PLOTBUF << ((dfbfix *) Tcl_GetHashValue(p1))->getIndex(j)
              << " "
              << ((dfbfix *) Tcl_GetHashValue(p2))->getIndex(j) << "\n";
    }
  }
  PLOTBUF.close();

  system("xgraph .plotbuf &");
  return TCL_OK;
}
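The data part of ScatQueue pairs the samples of two queues and stops at the length of the shorter one. A minimal sketch of that pairing, assuming the queue contents are already available as vectors of doubles; write_scatter is a hypothetical helper and the output stream stands in for the .plotbuf file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Writes one "x y" pair per line for as many samples as both inputs hold
// and returns the number of points written.
std::size_t write_scatter(std::ostream &os,
                          const std::vector<double> &x,
                          const std::vector<double> &y) {
  assert(os.good());
  std::size_t n = std::min(x.size(), y.size());
  for (std::size_t j = 0; j < n; ++j) {
    os << x[j] << " " << y[j] << "\n";
  }
  return n;
}
```

Writing to a std::ostringstream instead of a file makes the pairing easy to check before the result is handed to a plotting tool.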


//---------------------------------------------------------------//
int StatQueue(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc > 2) {
    interp->result = "Usage: statq ?queue?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB()) {
    IF_SUFFIX("_mark")
      continue;
    IF_SUFFIX("_stim")
      continue;
    if (!match || (!strcmp(r->name(), match))) {
      strstream N;
      N << *r << ends;
      Tcl_AppendElement(interp, N.str());
    }
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int ClearQueue(ClientData, Tcl_Interp *interp, int argc, char **) {
  if (argc > 1) {
    interp->result = "Usage: clearq\n";
    return TCL_ERROR;
  }

  dfbfix *r;
  for (r = listOfFB; r; r = r->nextFB())
    while (r->getSize() > 0)
      r->pop();

  return TCL_OK;
}

//---------------------------------------------------------------//
int ListSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: lists ?schedule?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, ((sysgen *) Tcl_GetHashValue(p))->getname());
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&sched_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, ((sysgen *) Tcl_GetHashValue(p))->getname());
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int RunSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 3)) {
    interp->result = "Usage: runs schedule clock_iterations\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
  if (p != 0) {
    unsigned v;
    sscanf(argv[2], "%d", &v);
    sysgen *sys = (sysgen *) Tcl_GetHashValue(p);

    while (v--)
      sys->run(*glbClk);

  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int VhdlSchedule(ClientData, Tcl_Interp *interp, int argc, char **argv) {

  if ((argc != 2)) {
    interp->result = "Usage: vhdls schedule\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&sched_hash, argv[1]);
  if (p != 0) {
    sysgen *sys = (sysgen *) Tcl_GetHashValue(p);
    sys->vhdlook();
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int ListParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: listp ?parameter?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&doubles_hash, p));
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&doubles_hash, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&doubles_hash, p));
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int SetParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 3)) {
    interp->result = "Usage: setp parameter value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
  if (p != 0) {
    double v;
    sscanf(argv[2], "%lf", &v);
    double *q = (double *) Tcl_GetHashValue(p);
    *q = v;
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int ReadParameter(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if (argc != 2) {
    interp->result = "Usage: readp parameter\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *p = Tcl_FindHashEntry(&doubles_hash, argv[1]);
  if (p != 0) {
    double *q = (double *) Tcl_GetHashValue(p);
    strstream N;
    N << *q << ends;
    Tcl_AppendElement(interp, N.str());
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int ListAttribute(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc > 2)) {
    interp->result = "Usage: lista ?attribute?\n";
    return TCL_ERROR;
  }

  char *match = 0;
  if (argc == 2) {
    match = argv[1];
  }

  if (match) {
    Tcl_HashEntry *p = Tcl_FindHashEntry(&attr_hashfunc, argv[1]);
    if (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&attr_hashfunc, p));
    }
  } else {
    Tcl_HashSearch k;
    Tcl_HashEntry *p = Tcl_FirstHashEntry(&attr_hashfunc, &k);
    while (p != 0) {
      Tcl_AppendElement(interp, Tcl_GetHashKey(&attr_hashfunc, p));
      p = Tcl_NextHashEntry(&k);
    }
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int SetAttribute(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 3)) {
    interp->result = "Usage: seta attribute value\n";
    return TCL_ERROR;
  }

  Tcl_HashEntry *pf = Tcl_FindHashEntry(&attr_hashfunc, argv[1]);
  Tcl_HashEntry *pi = Tcl_FindHashEntry(&attr_hashint, argv[1]);

  if (pf != 0) {
    int n = (int) Tcl_GetHashValue(pi);
    double v;
    sscanf(argv[2], "%lf", &v);
    // call member func
    functorlist[(int) Tcl_GetHashValue(pf)](n, v);
  }

  return TCL_OK;
}

//---------------------------------------------------------------//
int SetLineStyle(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 2)) {
    interp->result = "Usage: lines 1/0\n";
    return TCL_ERROR;
  }

  int v;
  sscanf(argv[1], "%d", &v);
  if (v != 0)
    graphLines = 1;
  else
    graphLines = 0;

  return TCL_OK;
}

//---------------------------------------------------------------//
int Testbenches(ClientData, Tcl_Interp *interp, int argc, char **argv) {
  if ((argc != 2)) {
    interp->result = "Usage: testb 1/0\n";
    return TCL_ERROR;
  }

  int v;
  sscanf(argv[1], "%d", &v);
  if (v != 0)
    qtb::glbDisableTestbenches = 0;
  else
    qtb::glbDisableTestbenches = 1;

  return TCL_OK;
}

//---------------------------------------------------------------//
int OCAPIHelp(ClientData, Tcl_Interp *interp, int, char **) {
  Tcl_AppendElement(interp, "Available OCAPI-related commands:\n");
  Tcl_AppendElement(interp, "listq ?queue_name?             List queue(s)\n");
  Tcl_AppendElement(interp, "statq ?queue_name?             Queue(s) statistics\n");
  Tcl_AppendElement(interp, "readq queue_name               Return queue contents\n");
  Tcl_AppendElement(interp, "getq queue_name                Return and empty queue contents\n");
  Tcl_AppendElement(interp, "putq queue_name value          Add value to queue\n");
  Tcl_AppendElement(interp, "plotq queue_name ?...?         Display queue contents graphically\n");
  Tcl_AppendElement(interp, "scatq queue_name queue_name    Display queue contents graphically\n");
  Tcl_AppendElement(interp, "traceq ?tracenum queue_name?   Trace writes to the queue\n");
  Tcl_AppendElement(interp, "clearq                         Clears contents of queues\n");
  Tcl_AppendElement(interp, "lists ?schedule_name?          List available schedules\n");
  Tcl_AppendElement(interp, "runs schedule_name iter        Runs iter iterations of a schedule\n");
  Tcl_AppendElement(interp, "vhdls schedule_name            Dumps VHDL code for a schedule\n");
  Tcl_AppendElement(interp, "listp ?parameter_name?         List parameters\n");
  Tcl_AppendElement(interp, "setp parameter_name value      List parameters\n");
  Tcl_AppendElement(interp, "readp parameter_name           Return Variable Contents\n");
  Tcl_AppendElement(interp, "lista ?attribute_name?         List attributes\n");
  Tcl_AppendElement(interp, "seta attribute_name value      Set attribute\n");
  Tcl_AppendElement(interp, "lines 1/0                      Turns on/off line drawing\n");
  Tcl_AppendElement(interp, "testb 1/0                      Disables test benches\n");
  return TCL_OK;
}

//---------------------------------------------------------------//
// initialization and command definition
int AppInit(Tcl_Interp *interp) {

  if (Tcl_Init(interp) == TCL_ERROR)
    return TCL_ERROR;

#ifdef MAKE_WISH
  if (Tk_Init(interp) == TCL_ERROR)
    return TCL_ERROR;
#endif

  create_queue_hash();

  Tcl_CreateCommand(interp, "listq", ListQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "statq", StatQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "readq", ReadQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "getq", GetQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "putq", PutQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "plotq", PlotQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "scatq", ScatQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "traceq", TraceQueue, NULL, NULL);
  Tcl_CreateCommand(interp, "clearq", ClearQueue, NULL, NULL);

  Tcl_CreateCommand(interp, "lists", ListSchedule, NULL, NULL);
  Tcl_CreateCommand(interp, "runs", RunSchedule, NULL, NULL);
  Tcl_CreateCommand(interp, "vhdls", VhdlSchedule, NULL, NULL);

  Tcl_CreateCommand(interp, "listp", ListParameter, NULL, NULL);
  Tcl_CreateCommand(interp, "setp", SetParameter, NULL, NULL);
  Tcl_CreateCommand(interp, "readp", ReadParameter, NULL, NULL);

  Tcl_CreateCommand(interp, "lista", ListAttribute, NULL, NULL);
  Tcl_CreateCommand(interp, "seta", SetAttribute, NULL, NULL);

  Tcl_CreateCommand(interp, "testb", Testbenches, NULL, NULL);
  Tcl_CreateCommand(interp, "lines", SetLineStyle, NULL, NULL);
  Tcl_CreateCommand(interp, "OCAPI", OCAPIHelp, NULL, NULL);

  return TCL_OK;
}
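AppInit binds each command name to a handler that checks its argument count and returns an OK or error status. The same name-to-handler pattern can be sketched without Tcl. Everything in the block below (command_table, handler, lines_cmd, CMD_OK, CMD_ERROR) is a hypothetical stand-in and not the Tcl API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum status { CMD_OK = 0, CMD_ERROR = 1 };  // stand-ins for TCL_OK / TCL_ERROR

typedef status (*handler)(const std::vector<std::string> &args,
                          std::string &result);

// Hypothetical registry playing the role of the command bookkeeping.
class command_table {
  std::map<std::string, handler> cmds;
public:
  void create(const std::string &name, handler h) {
    assert(h != 0);
    cmds[name] = h;
  }
  // args[0] is the command name, like argv[0] in a command procedure.
  status eval(const std::vector<std::string> &args, std::string &result) const {
    if (args.empty()) { result = "empty command"; return CMD_ERROR; }
    std::map<std::string, handler>::const_iterator it = cmds.find(args[0]);
    if (it == cmds.end()) { result = "unknown command: " + args[0]; return CMD_ERROR; }
    return it->second(args, result);
  }
};

// Example handler using the argument-count check style of SetLineStyle.
// Simplification: it compares the text with "0" instead of parsing a number.
status lines_cmd(const std::vector<std::string> &args, std::string &result) {
  if (args.size() != 2) { result = "Usage: lines 1/0\n"; return CMD_ERROR; }
  result = (args[1] != "0") ? "on" : "off";
  return CMD_OK;
}
```

A handler that receives the wrong number of arguments reports its usage string through result, which is the role interp->result plays in the listing.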

//---------------------------------------------------------------//

interpreter & operator<<(interpreter &p, sysgen &s) {
  p.add(s);
  return p;
}

interpreter & operator<<(interpreter &p, clk &ck) {
  glbClk = &ck;
  return p;
}
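Both operator<< overloads return the interpreter by reference, which is what allows insertions to be chained in one statement. A minimal sketch of the pattern with hypothetical stand-in types (schedule for sysgen, sim_clock for clk, registry for interpreter):

```cpp
#include <cassert>
#include <string>
#include <vector>

struct schedule { std::string name; };  // hypothetical stand-in for sysgen
struct sim_clock { int ticks; };        // hypothetical stand-in for clk

class registry {                        // hypothetical stand-in for interpreter
public:
  std::vector<const schedule *> schedules;
  const sim_clock *clock;
  registry() : clock(0) {}
};

// Each overload returns the registry so that insertions can be chained.
registry &operator<<(registry &r, const schedule &s) {
  assert(!s.name.empty());
  r.schedules.push_back(&s);
  return r;
}

registry &operator<<(registry &r, const sim_clock &c) {
  r.clock = &c;
  return r;
}
```

The system code inserts one object per statement (P << GEN; P << TX; and so on). With these return types, the chained form P << GEN << TX << ck would be equivalent.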

void interpreter::observe(double &v, char *name) {
  int present;
  Tcl_SetHashValue(Tcl_CreateHashEntry(&doubles_hash, name, &present), (char *) &v);
}

void interpreter::obsAttr(Callback2wRet<int, double, int> f, int n, char *name) {
  int present;
  functorlist[numfunctors++] = f;
  if (numfunctors > 100) {
    cerr << "*** ERROR: max num functors exceeded\n";
    exit(0);
  }
  Tcl_SetHashValue(Tcl_CreateHashEntry(&attr_hashfunc, name, &present), (char *) numfunctors - 1);
  Tcl_SetHashValue(Tcl_CreateHashEntry(&attr_hashint, name, &present), (char *) n);
}
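obsAttr stores a callback together with an attribute identifier under a name, and SetAttribute later looks the name up and calls the callback with that identifier and a value. The sketch below shows the same lookup-then-call flow using std::function and std::map in place of functorlist and the Tcl hash tables. The class attribute_table and its members are hypothetical, and the capacity check before storing is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical table: a name maps to a setter callback and the attribute
// identifier that is passed to it.
class attribute_table {
  struct entry {
    std::function<int(int, double)> set;  // same shape as setAttr(int, double)
    int attr;
  };
  std::map<std::string, entry> entries;
  std::size_t capacity;
public:
  explicit attribute_table(std::size_t cap) : capacity(cap) {}

  // Returns false, storing nothing, when a new name would exceed capacity.
  bool add(const std::string &name, std::function<int(int, double)> set, int attr) {
    assert(static_cast<bool>(set));
    if (entries.size() >= capacity && entries.find(name) == entries.end())
      return false;
    entry e;
    e.set = set;
    e.attr = attr;
    entries[name] = e;
    return true;
  }

  // Mirrors "seta attribute value": look up by name, then call the setter.
  bool set(const std::string &name, double v) {
    std::map<std::string, entry>::iterator it = entries.find(name);
    if (it == entries.end()) return false;
    it->second.set(it->second.attr, v);
    return true;
  }
};
```

Checking the capacity before the store, instead of after it, avoids writing past the end of a fixed-size table when the limit is reached.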

interpreter::interpreter() {
  Tcl_InitHashTable(&sched_hash, TCL_STRING_KEYS);
  Tcl_InitHashTable(&doubles_hash, TCL_STRING_KEYS);
  Tcl_InitHashTable(&attr_hashfunc, TCL_STRING_KEYS);
  Tcl_InitHashTable(&attr_hashint, TCL_STRING_KEYS);
  numfunctors = 0;
  traces[0] = &trace0; tracedqueue[0] = &nilFB;
  traces[1] = &trace1; tracedqueue[1] = &nilFB;
  traces[2] = &trace2; tracedqueue[2] = &nilFB;
  traces[3] = &trace3; tracedqueue[3] = &nilFB;
  traces[4] = &trace4; tracedqueue[4] = &nilFB;
  traces[5] = &trace5; tracedqueue[5] = &nilFB;
  traces[6] = &trace6; tracedqueue[6] = &nilFB;
  traces[7] = &trace7; tracedqueue[7] = &nilFB;
}

void interpreter::add(sysgen &s) {
  int present;
  Tcl_SetHashValue(Tcl_CreateHashEntry(&sched_hash, s.getname(), &present), (char *) &s);
}

void interpreter::go(int argc, char **argv) {
#ifdef MAKE_WISH
  Tk_Main(argc, argv, AppInit);
#else
  Tcl_Main(argc, argv, AppInit);
#endif

}


5.3 driver/sys.cxx

// sys.cxx
// All rights reserved -- Imec 1998
// @(#)sys.cxx 1.5 98/03/31

#include "qlib.h"
#include "hshake.h"
#include "driver.h"
#include "sys.h"

double glbQPSK       = 0.;  // for QPSK -> 1
double glbDiff       = 0.;  // for Diff Enc->
double glbT1         = 0.;
double glbT2         = 0.;
double glbT3         = 0.;
double glbT20        = 0.;
double glbNoiseLevel = 0.;
double glbADWbits    = 10.;
double glbADLbits    = 6.;


int main(int argc, char **argv) {

  LOADTYPES(../rx/TYPEDEF);

  // global synchronous clock
  clk ck;

  //
  //
  // byte source
  //
  FBQ(tx_bytes);
  pseudorn_gen GEN_RN("gen_rx",
                      tx_bytes);

  sysgen GEN("GEN");
  GEN << GEN_RN;

  //
  // transmitter
  //
  FBQ(tx_rnd_bytes);
  FBQ(tx_symbols);
  FBQ(tx_dif_symbols);
  FBQ(tx_ival);
  FBQ(tx_qval);
  FBQ(tx_sig);
  FBQ(tx_sig_quant);

  rnd TX_RND("tx_derandm",
             tx_bytes,
             tx_rnd_bytes);
  tuplelize TX_TUPLE("tx_tuple",
                     tx_rnd_bytes,
                     tx_symbols,
                     glbQPSK);
  diffenc TX_DIFFE("tx_diffe",
                   tx_symbols,
                   tx_dif_symbols,
                   glbQPSK,
                   glbDiff);
  map TX_MAP("tx_map",
             tx_dif_symbols,
             tx_ival,
             tx_qval,
             glbQPSK);
  shape TX_SHAPE("tx_shape",
                 tx_ival,
                 tx_qval,
                 tx_sig);
  ad TX_AD("tx_ad",
           tx_sig,
           tx_sig_quant,
           glbADWbits,
           glbADLbits);

  sysgen TX("TX");
  TX << TX_RND;
  TX << TX_TUPLE;
  TX << TX_DIFFE;
  TX << TX_MAP;
  TX << TX_SHAPE;
  TX << TX_AD;

  //
  //
  // channel
  //
  FBQ(chan_isi);
  FBQ(chan_out);

  fir CHAN_FIR("chan_fir",
               tx_sig_quant,
               chan_isi,
               glbT1,
               glbT2,
               glbT3,
               glbT20);

  noise CHAN_NOISE("chan_noise",
                   chan_isi,
                   chan_out,
                   glbNoiseLevel);

  sysgen CHAN("CHAN");
  CHAN << CHAN_FIR;
  CHAN << CHAN_NOISE;

  //
  //
  // receiver
  //
  FBQ(rx_constel_mode);
  FBQ(rx_lms_i);
  FBQ(rx_lms_q);
  FBQ(rx_symtype);
  lmsff RX_LMSFF("lmsff",
                 ck,
                 rx_constel_mode,
                 chan_out,

                 rx_lms_i,
                 rx_lms_q,
                 rx_symtype
                 );

  RX_LMSFF.setAttr(lmsff::FWLENGTH, 8        );
  RX_LMSFF.setAttr(lmsff::STEP_PAR, 4        );
  RX_LMSFF.setAttr(lmsff::P0,       -0.2*2.0);
  RX_LMSFF.setAttr(lmsff::P1,        0.7*2.0);
  RX_LMSFF.setAttr(lmsff::P2,        0.7*2.0);
  RX_LMSFF.setAttr(lmsff::P3,       -0.2*2.0);
  RX_LMSFF.setAttr(lmsff::REF,      3.0      );
  RX_LMSFF.setAttr(lmsff::INIT,              );
  RX_LMSFF.setAttr(lmsff::SPS_PAR,  4        );

  FBQ(rx_symtype_at);
  FBQ(rx_diff_mode);
  FBQ(rx_symbol);
  demap RX_DEMAP("demap",
                 ck,
                 rx_symtype,
                 rx_diff_mode,
                 rx_lms_i,
                 rx_lms_q,

                 rx_symtype_at,
                 rx_symbol
                 );

  RX_DEMAP.setAttr(demap::DEBUGMODE, 0);
  RX_DEMAP.setAttr(demap::REF, 3.0);

  FBQ(rx_syncro);
  FBQ(rx_byte_rnd);
  detuple RX_DETUPLE("detuple",
                     ck,
                     rx_symbol,
                     rx_symtype_at,

                     rx_byte_rnd,
                     rx_syncro
                     );

  RX_DETUPLE.setAttr(detuple::DEBUGMODE, 0);

  FBQ(rx_byte_out);
  FBQ(rx_sync_out);
  derand RX_DERAND("derand",
                   ck,
                   rx_byte_rnd,
                   rx_syncro,

                   rx_byte_out,
                   rx_sync_out
                   );

  RX_DERAND.setAttr(derand::DEBUGMODE, 0);
  RX_DERAND.setAttr(derand::SEED, 0x3f);
  RX_DERAND.setAttr(derand::BYPASS, 0);

  sysgen RX_UT("RX_UT");
  RX_UT << RX_LMSFF;
  RX_UT << RX_DEMAP;
  RX_UT << RX_DETUPLE;
  RX_UT << RX_DERAND;

  //--- clocktrue definition
  handshake hsk1("h1", ck);
  handshake hsk2("h2", ck);
  handshake hsk3("h3", ck);

  rx_lms_i.sethandshake(hsk1);
  rx_symbol.sethandshake(hsk2);
  rx_byte_rnd.sethandshake(hsk3);

  RX_LMSFF.define();
  RX_DEMAP.define();
  RX_DETUPLE.define();
  RX_DERAND.define();

  sysgen RX_TI("RX_TI");
  RX_TI << RX_LMSFF.fsm();
  RX_TI << RX_DEMAP.fsm();
  RX_TI << RX_DETUPLE.fsm();
  RX_TI << RX_DERAND.fsm();

  // iopad definition
  dfix T_byte(0, 8, 0);
  RX_TI.inpad(chan_out, T(T_sample_lms));
  RX_TI.inpad(rx_diff_mode, T_bit);
  RX_TI.inpad(rx_constel_mode, T_bit);
  RX_TI.outpad(rx_byte_out, T_byte);
  RX_TI.outpad(rx_sync_out, T_bit);

  //--- insert clear registers state
  RX_LMSFF.fsm().clear_regs();
  RX_DEMAP.fsm().clear_regs();
  RX_DETUPLE.fsm().clear_regs();
  RX_DERAND.fsm().clear_regs();

  // testbench generator for this clocktrue model
  RX_LMSFF.fsm().tb_enable();
  RX_DEMAP.fsm().tb_enable();
  RX_DETUPLE.fsm().tb_enable();
  RX_DERAND.fsm().tb_enable();
  RX_TI.tb_enable();
  RX_TI.generate();

  //
  //
  // interpreter
  //
  interpreter P;
  P << GEN;
  P << TX;
  P << CHAN;
  P << RX_UT;
  P << RX_TI;
  P << ck;

  P.observe(glbQPSK,       "QPSK");
  P.observe(glbT1,         "T1");
  P.observe(glbT2,         "T2");
  P.observe(glbT3,         "T3");
  P.observe(glbT20,        "T20");
  P.observe(glbNoiseLevel, "NoiseLevel");
  P.observe(glbADWbits,    "ADWbits");
  P.observe(glbADLbits,    "ADLbits");
  P.observe(glbDiff,       "DiffEnc");

  P.ATTRIBUTE(lmsff,  RX_LMSFF,  FWLENGTH, lmsff_fwlen);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  STEP_PAR, lmsff_step);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  P0,       lmsff_p0);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  P1,       lmsff_p1);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  P2,       lmsff_p2);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  P3,       lmsff_p3);
  P.ATTRIBUTE(lmsff,  RX_LMSFF,  INIT,     lmsff_init);
  P.ATTRIBUTE(derand, RX_DERAND, SEED,     derand_seed);
  P.ATTRIBUTE(derand, RX_DERAND, BYPASS,   derand_bypass);

  P.go(argc, argv);

}



5.4 driver/sys.h

#ifndef SYS_H
#define SYS_H

// @(#)sys.h 1.3 98/03/27

#include "Callback2wRet.h"

#define ATTRIBUTE(CLASS, INST, PARM, NAME) \
  obsAttr(make_callback((Callback2wRet<int, double, int> *) 0, &INST, CLASS::setAttr), CLASS::PARM, #NAME)

// P.obsAttr(make_callback((Callback2wRet<int, double, int> *) 0,
//           &RX_LMSFF, lmsff::setAttr), lmsff::FWLENGTH, "lmsff_fwlen");

#define DEBUGQ(A) FBQ(A); FBQ(db_##A); A.asDup(db_##A);

#include "../tx/rnd.h"
#include "../tx/tuplelize.h"
#include "../tx/diffenc.h"
#include "../tx/map.h"
#include "../tx/shape.h"
#include "../tx/ad.h"
#include "../chan/fir.h"
#include "../chan/noise.h"
#include "../rx/lmsff.h"
#include "../rx/demap.h"
#include "../rx/detuple.h"
#include "../rx/derand.h"

#endif
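
The ATTRIBUTE macro of sys.h binds a command name to an attribute setter of a block instance; the `#NAME` operator turns the last macro argument into the string under which the attribute is registered. The following is a minimal sketch of that mechanism only. `Block`, `Registry` and `ATTRIBUTE_SKETCH` are hypothetical stand-ins introduced for illustration; they are not the interpreter, the `Callback2wRet` template or the `make_callback` helper of the listing, whose definitions are not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a block such as lmsff: it exposes one setter
// that dispatches on an attribute code, like setAttr in the listing.
struct Block {
    enum { FWLENGTH, STEP_PAR };
    int fwlen = 0;
    int step = 0;
    int setAttr(int attr, double v) {
        switch (attr) {
        case FWLENGTH: fwlen = static_cast<int>(v); break;
        case STEP_PAR: step = static_cast<int>(v); break;
        }
        return 1;
    }
};

// Hypothetical stand-in for the interpreter: maps a name to (instance, code).
struct Registry {
    struct Entry { Block* inst; int parm; };
    std::map<std::string, Entry> table;

    void obsAttr(Block* inst, int parm, const char* name) {
        assert(inst != nullptr);
        table[name] = Entry{inst, parm};
    }
    // Returns 1 if the name is registered and the setter ran, 0 otherwise.
    int set(const std::string& name, double v) {
        auto it = table.find(name);
        if (it == table.end()) return 0;
        return it->second.inst->setAttr(it->second.parm, v);
    }
};

// Same shape as ATTRIBUTE: #NAME stringizes the last argument.
#define ATTRIBUTE_SKETCH(CLASS, INST, PARM, NAME) \
    obsAttr(&INST, CLASS::PARM, #NAME)
```

With a `Registry P` and a `Block rx`, the call `P.ATTRIBUTE_SKETCH(Block, rx, FWLENGTH, lmsff_fwlen);` registers the setter under the name "lmsff_fwlen", so that `P.set("lmsff_fwlen", 8)` stores 8 in `rx.fwlen`.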

6 Receiver Code
6.1 rx/demap.h

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   MAP
//
// Purpose:
//   Mapping of QAM16/QPSK constellations to symbols  @(#)demap.h 1.5 98/03/30
//
// Author:
//   Patrick Schaumont / Radim Cmar
//

#ifndef DEMAP_H
#define DEMAP_H

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

class demap : public base {
public:

  clk& _ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(symtype_in);
  PRT(diff_mode);
  PRT(i_in);
  PRT(q_in);
  PRT(symtype_out);
  PRT(symbol_out);
  ctlfsm _fsm;

public:
  enum {DEBUGMODE, REF};
  enum {QAM16, QPSK};
  int debug_mode;
  double ref;

  demap(char *name,
        clk& clk,
        _PRT(symtype_in),
        _PRT(diff_mode),
        _PRT(i_in),
        _PRT(q_in),
        _PRT(symtype_out),
        _PRT(symbol_out));

  ~demap();
  int setAttr(int Attr, double v);
  int decide(dfix constel, dfix est);
  int run();
  void define();
  ctlfsm & fsm();
#ifdef I2C
  i2c_slave & slave();
#endif

};

#endif
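
The decide() member declared in this header slices one received coordinate into a level index, using the `ref` attribute as the outermost nominal level (its definition appears in rx/demap.cxx). The following is a minimal sketch of that slicer under the assumption that plain `double` values replace the `dfix` fixed-point type; `Constellation` and `decide_sketch` are illustrative names that do not occur in the listing.

```cpp
#include <cassert>

enum Constellation { QAM16, QPSK };

// Slice one coordinate (I or Q) into a level index.
// For QAM16 the nominal levels are -3c, -c, +c, +3c with c = ref/3,
// so the decision thresholds lie at -2c, 0 and +2c.
int decide_sketch(Constellation constel, double est, double ref) {
    assert(ref > 0.0);
    const double c = ref / 3.0;
    if (constel == QAM16) {
        if (est > 2 * c)  return 3;
        if (est > 0.0)    return 2;
        if (est > -2 * c) return 1;
        return 0;
    }
    return est > 0.0 ? 1 : 0;   // QPSK: sign decision only
}
```

With `ref = 3.0`, the QAM16 inputs 3.0, 1.0, -1.0 and -3.0 yield the indices 3, 2, 1 and 0 respectively. The two indices obtained for the I and Q coordinates then address the constellation table.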



6.2 rx/demap.cxx

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   MAP
//
// Purpose:
//   Mapping of QAM16/QPSK constellations to symbols  @(#)demap.cxx 1.8 98/04/07
//
// Author:
//   Radim Cmar
//

#include "demap.h"
#include "trans.h"

// QAM16
static int vIQMap16[4][4] = {
  { 15, 14, 10, 11},
  { 13, 12,  8,  9},
  {  5,  4,  0,  2},
  {  7,  6,  1,  3}};

// QPSK
static int vIQMap4[2][2] = {
  { 3, 2}, { 1, 0}};

demap::demap(char *name,
             clk& clk,
             _PRT(symtype_in),
             _PRT(diff_mode),
             _PRT(i_in),
             _PRT(q_in),
             _PRT(symtype_out),
             _PRT(symbol_out)
             ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(symtype_in, T_bit),
  IS_SIG(diff_mode, T_bit),
  IS_SIG(i_in, T_float),
  IS_SIG(q_in, T_float),
  IS_REG(symtype_out, _ck, T_bit),
  IS_REG(symbol_out, _ck, T_float)
{
  IS_IP(symtype_in);
  IS_IP(diff_mode);
  IS_IP(i_in);
  IS_IP(q_in);
  IS_OP(symtype_out);
  IS_OP(symbol_out);

  debug_mode = 0;
}

demap::~demap() {
}

int demap::setAttr(int Attr, double v) {
  switch (Attr) {
  case REF:
    ref = v; break;
  case DEBUGMODE:
    debug_mode = (int) v; break;
  }
  return 1;
}

//


int demap::run() {

  int thissym;
  int ik, qk;
  int n_ik, n_qk;
  static int ik_at = 1;
  static int qk_at = 1;

  if ((FBID(i_in).getSize() < 1) ||
      (FBID(q_in).getSize() < 1) ||
      (FBID(symtype_in).getSize() < 1) ||
      (FBID(diff_mode).getSize() < 1)
     )
    return 0;

  dfix vi      = FBID(i_in).get();
  dfix vq      = FBID(q_in).get();
  dfix constel = FBID(symtype_in).get();
  dfix diffdec = FBID(diff_mode).getIndex(0);

  int indi = decide(constel, vi);
  int indq = decide(constel, vq);

  if (constel == QAM16) {
    thissym = vIQMap16[indi][indq];
  } else {
    thissym = vIQMap4[indi][indq];
  }
  int thissym0 = thissym;

  if (diffdec == 1) {
    if (constel == QAM16) {
      ik = (thissym >> 3) & 1;
      qk = (thissym >> 2) & 1;
      n_ik = (((~(ik ^ qk)) & (ik ^ ik_at)) | ((ik ^ qk) & (qk ^ qk_at))) & 1;
      n_qk = (((~(ik ^ qk)) & (qk ^ qk_at)) | ((ik ^ qk) & (ik ^ ik_at))) & 1;
      ik_at = ik;
      qk_at = qk;
      thissym = (n_ik << 3) + (n_qk << 2) + (thissym & 3);

    } else {
      ik = (thissym >> 1) & 1;
      qk = (thissym     ) & 1;
      n_ik = (((~(ik ^ qk)) & (ik ^ ik_at)) | ((ik ^ qk) & (qk ^ qk_at))) & 1;
      n_qk = (((~(ik ^ qk)) & (qk ^ qk_at)) | ((ik ^ qk) & (ik ^ ik_at))) & 1;
      ik_at = ik;
      qk_at = qk;
      thissym = (n_ik << 1) + (n_qk);
    }
  }

  if (debug_mode)
    cout << " constel: "  << constel
         << " i: "        << vi
         << " q: "        << vq
         << " thissym0: " << thissym0
         << " ik: "       << ik
         << " qk: "       << qk
         << " n_ik: "     << n_ik
         << " n_qk: "     << n_qk
         << " thissym: "  << thissym << endl;

  FBID(symbol_out)  << (thissym);
  FBID(symtype_out) << (constel);

  return 1;
}


int demap::decide(dfix constel, dfix est) {
  double c = ref/3;
  if (constel == QAM16) {
    if (est > dfix(2*c))
      return 3;
    else if (est > dfix(0))
      return 2;
    else if (est > dfix(-2*c))
      return 1;
    else
      return 0;
  } else {
    if (est > dfix(0.))
      return 1;
    else
      return 0;
  }
}

//

ctlfsm & demap::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave & demap::slave() {
  return _slave;
}
#endif


void demap::define() {
  int i;

  dfix T_2bit(0, 2, 0, dfix::tc);
  dfix T_cnt(0, 3, 0, dfix::ns);    // symbol counter up to 4
  dfix T_symb(0, 4, 0, dfix::ns);   // symbol type 0..15

  PORT_TYPE(i_in, T(T_sample_demap));   // user type
  PORT_TYPE(q_in, T(T_sample_demap));   // user type
  PORT_TYPE(symbol_out, T_symb);

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);
  STATE(phase3);

  SIGCK(constelqam, _ck, T_bit);
  SIGCK(diffdecod, _ck, T_bit);
  SIGCK(i_inp, _ck, T(T_sample_demap));
  SIGCK(q_inp, _ck, T(T_sample_demap));
  SIGW(indi, T_2bit);
  SIGW(indq, T_2bit);
  SIGCK(start_frame, _ck, T_bit);
  _sigarray maps16("maps", 16, &_ck, T_symb);
  _sigarray maps4("maps", 4, &_ck, T_symb);
  SIGW(symb0, T_symb);
  SIGW(symb1, T_symb);
  SIGW(ik, T_bit);
  SIGW(qk, T_bit);
  SIGW(ik_1, T_bit);
  SIGW(qk_1, T_bit);
  SIGCK(ik_at, _ck, T_bit);
  SIGCK(qk_at, _ck, T_bit);
  SIGW(ak, T_bit);
  SIGW(bk, T_bit);

#ifdef I2C
  for (i = 0; i < 16; i++)
    _slave.put(&maps16[i]);
  for (i = 0; i < 4; i++)
    _slave.put(&maps4[i]);
#endif

  SFG(demap_allways);
  GET(diff_mode);
  diffdecod = diff_mode;

  SFG(demap_reset);
  for (i = 0; i < 16; i++)
    maps16[i] = W(T_symb, vIQMap16[i/4][i%4]);
  for (i = 0; i < 4; i++)
    maps4[i] = W(T_symb, vIQMap4[i/2][i%2]);

  setv(start_frame, 0);
  setv(ik_at, 0);
  setv(qk_at, 0);

  SFG(demap_qam16);
  double c = ref/3;
  indi = (i_inp <= C(i_inp, -2*c)).cassign(C(indi, 0),
         (i_inp <= C(i_inp, 0.0)).cassign(C(indi, 1),
         (i_inp <= C(i_inp, +2*c)).cassign(C(indi, 2), C(indi, 3))));

  indq = (q_inp <= C(q_inp, -2*c)).cassign(C(indq, 0),
         (q_inp <= C(q_inp, 0.0)).cassign(C(indq, 1),
         (q_inp <= C(q_inp, +2*c)).cassign(C(indq, 2), C(indq, 3))));

  symb0 = ((indi==W(T_2bit,0)) & (indq==W(T_2bit,0))).cassign(maps16[0],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,1))).cassign(maps16[1],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,2))).cassign(maps16[2],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,3))).cassign(maps16[3],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,0))).cassign(maps16[4],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,1))).cassign(maps16[5],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,2))).cassign(maps16[6],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,3))).cassign(maps16[7],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,0))).cassign(maps16[8],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,1))).cassign(maps16[9],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,2))).cassign(maps16[10],
          ((indi==W(T_2bit,2)) & (indq==W(T_2bit,3))).cassign(maps16[11],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,0))).cassign(maps16[12],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,1))).cassign(maps16[13],
          ((indi==W(T_2bit,3)) & (indq==W(T_2bit,2))).cassign(maps16[14],
          maps16[15]
          )))))))))))))));



  ik ...
  qk ...

  ik = ...
  qk = ...
  ak = ((~(ik ^ qk)) & (ik ^ ik_1)) | ((ik ^ qk) & (qk ^ qk_1));
  bk = ((~(ik ^ qk)) & (qk ^ qk_1)) | ((ik ^ qk) & (ik ^ ik_1));
  ik_at = ik;
  qk_at = qk;

  symb1 = (symb0 & W(T_symb,3)) |
          ((cast(T_symb,ak) << W(T_symb,3)) & W(T_symb,8)) |
          ((cast(T_symb,bk) << W(T_symb,2)) & W(T_symb,4));
  symbol_out = (diffdecod).cassign(symb1, symb0);


  SFG(demap_qpsk);
  indi = (i_inp < C(i_inp,0)).cassign(C(indi,0), C(indi,1));
  indq = (q_inp < C(q_inp,0)).cassign(C(indq,0), C(indq,1));

  symb0 = ((indi==W(T_2bit,0)) & (indq==W(T_2bit,0))).cassign(maps4[0],
          ((indi==W(T_2bit,0)) & (indq==W(T_2bit,1))).cassign(maps4[1],
          ((indi==W(T_2bit,1)) & (indq==W(T_2bit,0))).cassign(maps4[2],
          maps4[3]
          )));

  ik_1 = (start_frame).cassign(W(T_bit,0), ik_at);
  qk_1 = (start_frame).cassign(W(T_bit,0), qk_at);

  ik = cast(T_bit, symb0 >> W(T_bit,1));
  qk = cast(T_bit, symb0);
  ak = ((~(ik ^ qk)) & (ik ^ ik_1)) | ((ik ^ qk) & (qk ^ qk_1));
  bk = ((~(ik ^ qk)) & (qk ^ qk_1)) | ((ik ^ qk) & (ik ^ ik_1));
  ik_at = ik;
  qk_at = qk;

  symb1 = ((cast(T_symb,ak) << W(T_symb,1)) & W(T_symb,2)) |
          (cast(T_symb,bk) & W(T_symb,1));
  symbol_out = (diffdecod).cassign(symb1, symb0);

  SFG(demap_in);
  GET(i_in);
  GET(q_in);
  GET(symtype_in);
  i_inp = i_in;
  q_inp = q_in;
  constelqam = ~symtype_in;
  symtype_out = symtype_in;

  SFG(demap_out);
  PUT(symbol_out);
  PUT(symtype_out);

  //

  DEFAULTDO(demap_allways);
  AT(rst) ALLWAYS
    DO(demap_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS
    DO(demap_in)
    GOTO(phase2);

  AT(phase2) ON(_cnd(constelqam))
    DO(demap_qam16)
    GOTO(phase3);

  AT(phase2) ON(!_cnd(constelqam))
    DO(demap_qpsk)
    GOTO(phase3);

  AT(phase3) ALLWAYS
    DO(demap_out)
    GOTO(phase1);

#ifdef I2C
  _slave.attach(_fsm, *state_phase2, _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("demap_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("demap_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());
}

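
The differential decoding performed in demap::run() and in the demap_qam16 and demap_qpsk signal flow graphs compares the two quadrant bits of the current symbol with those of the previous symbol. The following is a minimal sketch of that step in isolation, assuming plain `int` bits in place of the `dfix` and signal types of the listing; `DiffState` and `diff_decode_sketch` are illustrative names that do not occur in the listing.

```cpp
#include <cassert>

// State carried from one symbol to the next: the previous raw quadrant bits.
// run() initialises both to 1, which this sketch follows.
struct DiffState {
    int ik_at = 1;
    int qk_at = 1;
};

// Decode the quadrant bits (ik, qk) against the previous pair.
// Returns the decoded pair packed as (n_ik << 1) | n_qk.
int diff_decode_sketch(DiffState& s, int ik, int qk) {
    assert((ik == 0 || ik == 1) && (qk == 0 || qk == 1));
    const int x = ik ^ qk;
    // When ik == qk each decoded bit is the change of its own bit;
    // when ik != qk the roles of the I and Q differences are swapped.
    const int n_ik = (((~x) & (ik ^ s.ik_at)) | (x & (qk ^ s.qk_at))) & 1;
    const int n_qk = (((~x) & (qk ^ s.qk_at)) | (x & (ik ^ s.ik_at))) & 1;
    s.ik_at = ik;
    s.qk_at = qk;
    return (n_ik << 1) | n_qk;
}
```

A symbol whose quadrant bits equal those of the preceding symbol decodes to 0, since both differences vanish. For QAM16 the listing applies the same expressions to bits 3 and 2 of the symbol and leaves the two low-order bits unchanged.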

6.3 rx/derand.h

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   PRBS
//
// Purpose:
//   De-randomises data using a 6-bit or 15-bit
//   Pseudo Random Binary Sequence.  @(#)derand.h 1.2 98/03/30
//
// Author:
//   R. Cmar
//
//

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

#ifndef DERAND_H
#define DERAND_H

class derand : public base
{

public:
  clk & _ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(byte_in);
  PRT(syncro);
  PRT(byte_out);
  PRT(sync_out);
  ctlfsm _fsm;

  enum {SEED, BYPASS, DEBUGMODE};

  derand(char *name,
         clk& clk,
         _PRT(byte_in),
         _PRT(syncro),
         _PRT(byte_out),
         _PRT(sync_out)
         );

  int setAttr(int Attr, double v=0);
  int run();
  void define();
  ctlfsm & fsm();
#ifdef I2C
  i2c_slave & slave();
#endif

public:
  int bypass;
  int seed;
  int debug;
};

#endif

6.4 rx/derand.cxx

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   PRBS
//
// Purpose:
//   De-randomises data using a 6-bit or 15-bit
//   Pseudo Random Binary Sequence.  @(#)derand.cxx 1.8 98/04/07
//
// Authors:
//   R. Cmar
//
//

#include "derand.h"
#include "trans.h"

derand::derand(char *name,
               clk& clk,
               _PRT(byte_in),
               _PRT(syncro),
               _PRT(byte_out),
               _PRT(sync_out)
               ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(byte_in, T_8bit),
  IS_SIG(syncro, T_bit),
  IS_REG(byte_out, clk, T_8bit),
  IS_REG(sync_out, clk, T_bit)
{
  IS_IP(byte_in);
  IS_IP(syncro);
  IS_OP(byte_out);
  IS_OP(sync_out);

  bypass = 0;
  seed = 0x3f;
  debug = 0;
}

int derand::setAttr(int Attr, double v)
{
  switch (Attr)
  {
  case SEED:
    seed = (int)v; break;
  case BYPASS:
    bypass = (int)v; break;
  case DEBUGMODE:
    debug = (int)v; break;
  }
  return 1;
}

//


int derand::run()
{
  static unsigned shiftreg = 0;

  #define BiT(k, n)  ((k >> (n-1)) & 1)
  #define MaSK(k, n) (k & ((1 << (n+1)) - 1))

  if ((FBID(byte_in).getSize() < 1) || (FBID(syncro).getSize() < 1))
    return 0;

  dfix data_in = FBID(byte_in).get();
  dfix sync    = FBID(syncro).get();

  unsigned data = unsigned(data_in.Val());

  if (bypass == 0) {

    if (sync == dfix(1))
      shiftreg = seed;

    unsigned mask = 0;
    int xbit;
    for (int k = 0; k < 8; k++) {
      xbit = BiT(shiftreg, 5) ^ BiT(shiftreg, 6);
      shiftreg = MaSK(xbit | (shiftreg << 1), 6);
      mask = (mask << 1) | xbit;
    }

    data ^= mask;
  }

  FBID(byte_out) << dfix((double)(data));
  return 1;
}


//

ctlfsm & derand::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave & derand::slave() {
  return _slave;
}
#endif

void derand::define() {

  dfix T_byte(0, 8, 0, dfix::ns);
  dfix T_sreg(0, 16, 0, dfix::ns);
  dfix T_num(0, 4, 0, dfix::ns);    // to express constants 0..15

  PORT_TYPE(byte_in, T_byte);       // 8 bits
  PORT_TYPE(byte_out, T_byte);      // 8 bits

  SIGW(mask, T_byte);               // 8 bits
  SIGCK(shiftreg, _ck, T_sreg);     // 16 bits
  SIGCK(seed, _ck, T_sreg);         // 16 bits
  SIGCK(bypass, _ck, T_bit);
  _sigarray xbits("xbits", 8+1, T_bit);
  _sigarray shifts("shifts", 8+1, T_sreg);
  _sigarray masks("masks", 8+1, T_byte);

#ifdef I2C
  _slave.put(&seed);
  _slave.put(&bypass);
#endif

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);

  SFG(rnd_reset);
  byte_out = W(T_byte, 0);
  seed = W(T_sreg, 0x3f);
  sync_out = W(T_bit, 0);
  bypass = W(T_bit, 0);
  shiftreg = W(T_sreg, 0);

  SFG(rnd_read);
  GET(byte_in);
  GET(syncro);

  #define BIT(s,k)  cast(T_bit, s >> W(T_num, k-1))
  #define MASK(s,n) (s & W(T_sreg, (1 << (n+1)) - 1))

  SFG(rnd_prbs6);

  shifts[0] = (syncro == W(T_bit,1)).cassign(seed, shiftreg);

  masks[0] = W(T_byte, 0);
  for (int k = 0; k < 8; k++) {
    xbits[k] = BIT(shifts[k], 5) ^ BIT(shifts[k], 6);
    shifts[k+1] = MASK((cast(T_sreg, xbits[k]) & W(T_sreg,1)) |
                       (shifts[k] << W(T_num,1)), 6);
    masks[k+1] = (masks[k] << W(T_byte,1)) |
                 (cast(T_byte, xbits[k]) & W(T_byte,1));
  }
  shiftreg = shifts[8];
  mask = masks[8];

  byte_out = (bypass).cassign(byte_in, byte_in ^ mask);
  sync_out = W(T_bit,1);

  SFG(rnd_write);
  PUT(byte_out);
  PUT(sync_out);
  sync_out = W(T_bit,0);

  //

  AT(rst) ALLWAYS
    DO(rnd_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS    // state << cond << sfg << sfg << state
    DO(rnd_read)        // phase1 << allways << rnd_read << rnd_prb6 << phase2
    DO(rnd_prbs6)
    GOTO(phase2);

  AT(phase2) ALLWAYS
    DO(rnd_write)
    GOTO(phase1);

#ifdef I2C
  _slave.attach(_fsm, *state_phase2, _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("derand_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("derand_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());
}

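
The de-randomiser of derand::run() generates eight feedback bits per byte from a shift register and exclusive-ORs the resulting mask with the data byte. The following is a minimal sketch of that loop, assuming standard integer types in place of `dfix` and of the queue interface; `Derand` and `step` are illustrative names, and the tap positions and the 7-bit register mask are those of the BiT and MaSK macros of the listing.

```cpp
#include <cassert>
#include <cstdint>

struct Derand {
    unsigned shiftreg = 0;
    unsigned seed = 0x3f;   // default seed of the listing

    // De-randomise one byte; `sync` reloads the shift register from the seed.
    std::uint8_t step(std::uint8_t data, bool sync) {
        if (sync) shiftreg = seed;
        unsigned mask = 0;
        for (int k = 0; k < 8; ++k) {
            // BiT(shiftreg,5) ^ BiT(shiftreg,6): bit positions 4 and 5
            const unsigned xbit = ((shiftreg >> 4) ^ (shiftreg >> 5)) & 1u;
            // MaSK(..., 6): keep the seven low-order bits
            shiftreg = (xbit | (shiftreg << 1)) & 0x7fu;
            mask = (mask << 1) | xbit;
        }
        assert(mask <= 0xffu);
        return static_cast<std::uint8_t>(data ^ mask);
    }
};
```

Because the mask sequence depends only on the seed and on the position in the frame, two instances started with the same seed and synchronised on the same byte produce identical masks; applying `step` at the sending side and again at the receiving side therefore restores the original byte.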

6.5 rx/detuple.h

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   TUPLE
//
// Purpose:
//   header detection + detuplelization  @(#)detuple.h 1.2 98/03/30
//
// Author:
//   Radim Cmar
//

#ifndef DETUPLE_H
#define DETUPLE_H

#include "qlib.h"
#include "macros.h"
#include "typedefine.h"

class detuple : public base {
public:

  clk& _ck;
  PRT(symbol);
  PRT(symtype);
  PRT(byte);
  PRT(syncro);
  ctlfsm _fsm;

  int debug_mode;

public:
  enum {DEBUGMODE};
  enum {QAM16, QPSK};

  detuple(char *name,
          clk& clk,
          _PRT(symbol),
          _PRT(symtype),
          _PRT(byte),
          _PRT(syncro)
          );

  ~detuple();
  int setAttr(int Attr, double v=0);
  int run();
  void define();
  ctlfsm & fsm();
};

#endif

6.6 rx/detuple.cxx

//
// COPYRIGHT
// =========
//
// Copyright 1996 IMEC, Leuven, Belgium
//
// All rights reserved.
//
//
// Module:
//   TUPLE
//
// Purpose:
//   header detection + detuplelization  @(#)detuple.cxx 1.3 98/04/07
//
// Author:
//   Radim Cmar
//

#include "detuple.h"
#include "trans.h"

detuple::detuple(char *name, clk& clk,
                 _PRT(symbol),
                 _PRT(symtype),
                 _PRT(byte),
                 _PRT(syncro)
                 ) : base(name),
  _ck(clk),
  IS_SIG(symbol, T_4bit),
  IS_SIG(symtype, T_bit),
  IS_REG(byte, _ck, T_8bit),
  IS_REG(syncro, _ck, T_bit)
{
  IS_IP(symbol);
  IS_IP(symtype);
  IS_OP(byte);
  IS_OP(syncro);

  debug_mode = 0;
}

detuple::~detuple() {
}

int detuple::setAttr(int Attr, double v) {
  switch (Attr) {
  case DEBUGMODE:
    debug_mode = (int)v; break;
  }
  return 1;
}


static int QAM16_sync[] = {0,0,5,5,0,0,5,5};
static int QPSK_sync[]  = {0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1};
static int QAM16_headlen = 8;
static int QPSK_headlen  = 16;

int detuple::run() {
  int i;

  static int tuplcnt = 0;
  static int corrcnt = 0;
  static int sync = 0;
  static dfix oldstype = 0;
  static dfix corrarr[16];
  static dfix tuplarr[4];

  int headlen;
  int symbcount;
  dfix tuple;

  if ((FBID(symbol).getSize() < 1) || (FBID(symtype).getSize() < 1))
    return 0;

  dfix symb  = FBID(symbol).get();
  dfix stype = FBID(symtype).get();

  if (stype == QAM16) {   // length of header depends on QAM16/QPSK constel
    headlen = QAM16_headlen;
    symbcount = 2;
  }
  else {
    headlen = QPSK_headlen;
    symbcount = 4;
  }

  if (corrcnt == headlen) {

    int equal = 1;                          // search for header
    for (i = 0; i < headlen; i++) {
      if (stype == QAM16)
        equal = equal & (corrarr[i] == QAM16_sync[headlen-1-i]);
      else
        equal = equal & (corrarr[i] == QPSK_sync[headlen-1-i]);
    }

    if (equal) {                            // header appeared

      if (stype == QAM16)   // flush tuplarr (even if not complete)
        tuple = tuplarr[0] + tuplarr[1]*16;
      else
        tuple = tuplarr[0] + tuplarr[1]*4 + tuplarr[2]*16 + tuplarr[3]*64;
      FBID(byte) << (tuple);
      FBID(syncro) << (sync);

      sync = 1;                             // indicates start of frame
      corrcnt = 1;
      tuplcnt = 0;
    }
    else {                                  // normal processing

      if (tuplcnt == symbcount) {
        if (stype == QAM16)
          tuple = tuplarr[0] + tuplarr[1]*16;
        else
          tuple = tuplarr[0] + tuplarr[1]*4 + tuplarr[2]*16 + tuplarr[3]*64;
        FBID(byte) << (tuple);
        FBID(syncro) << (sync);

        sync = 0;
        tuplcnt = 1;
      }
      else
        tuplcnt++;
    }
  }
  else
    corrcnt++;

  for (i = symbcount-1; i > 0; i--)
    tuplarr[i] = tuplarr[i-1];
  tuplarr[0] = corrarr[headlen-1];   // shift out the oldest symbol

  for (i = headlen-1; i > 0; i--)    // shift in new symbol
    corrarr[i] = corrarr[i-1];
  corrarr[0] = symb;

  if (oldstype != stype) {           // QPSK/QAM16 change
    corrcnt = 0;
    tuplcnt = 0;
  }
  oldstype = stype;

  return 1;
}

//

ctlfsm & detuple::fsm() {
  return _fsm;
}


void detuple::define() {
  int i;

  int headlen_qam = 8;
  int headlen_qpsk = 16;
  int symbcount_qam = 2;
  int symbcount_qpsk = 4;
  #define max(a,b) ((a > b) ? a : b)

  dfix T_cnt(0, 5, 0, dfix::ns);     // symbol counter up to 32
  dfix T_symb(0, 4, 0, dfix::ns);    // symbol type 0..15
  dfix T_tuple(0, 8, 0, dfix::ns);

  FSM(_fsm);
  INITIAL(rst);
  STATE(phase1);
  STATE(phase2);
  STATE(phase3);
  STATE(phase4);

  SIGCK(qamtype, _ck, T_bit);
  SIGCK(old_qamtype, _ck, T_bit);
  SIGCK(symbol_reg, _ck, T_symb);

  SIGCK(iniphase, _ck, T_bit);
  SIGCK(correlated, _ck, T_bit);
  SIGCK(tuple_ready, _ck, T_bit);

  SIGCK(corrcnt, _ck, T_cnt);
  SIGCK(tuplcnt, _ck, T_cnt);

  SIGCK(byte, _ck, T_tuple);
  SIGW(tuple_qam, T_tuple);
  SIGW(tuple_qpsk, T_tuple);

  _sigarray tuplarr("tarr", max(symbcount_qam, symbcount_qpsk), &_ck, T_symb);
  _sigarray corrarr("carr", max(headlen_qam, headlen_qpsk), &_ck, T_symb);
  _sigarray ref("ref", max(headlen_qam, headlen_qpsk), T_symb);
  _sigarray equal("equal", max(headlen_qam, headlen_qpsk), T_bit);

  //

  SFG(tupler_reset);
  setv(corrcnt, 0);
  setv(tuplcnt, 0);
  setv(old_qamtype, 1);
  setv(syncro, 0);

  SFG(tupler_read);
  GET(symbol);
  GET(symtype);
  symbol_reg = symbol;
  qamtype = ~symtype;

  SFG(tupler_test);
  iniphase = ((qamtype) & (corrcnt != W(T_cnt, headlen_qam)))
           | ((~qamtype) & (corrcnt != W(T_cnt, headlen_qpsk)));

  tuple_ready = (qamtype).cassign(tuplcnt == W(T_cnt, symbcount_qam),
                                  tuplcnt == W(T_cnt, symbcount_qpsk));

  SFG(tupler_corr);
  for (i = 0; i < max(headlen_qam, headlen_qpsk); i++) {
    int iqam  = (headlen_qam-1-i < 0) ? 0 : headlen_qam-1-i;
    int iqpsk = headlen_qpsk-1-i;
    ref[i] = (qamtype).cassign(W(T_symb, QAM16_sync[iqam]),
                               W(T_symb, QPSK_sync[iqpsk]));
    if (i == 0)
      equal[i] = (corrarr[i] == ref[i]);
    else
      equal[i] = equal[i-1] & (corrarr[i] == ref[i]);
  }
  correlated = (qamtype).cassign(equal[headlen_qam-1], equal[headlen_qpsk-1]);

  SFG(tupler_compose);
  tuple_qam = (cast(T_tuple, tuplarr[0]) & W(T_tuple, 15))
            | ((cast(T_tuple, tuplarr[1]) & W(T_tuple, 15)) << W(T_cnt, 4));

  tuple_qpsk = (cast(T_tuple, tuplarr[0]) & W(T_tuple, 3))
             | ((cast(T_tuple, tuplarr[1]) & W(T_tuple, 3)) << W(T_cnt, 2))
             | ((cast(T_tuple, tuplarr[2]) & W(T_tuple, 3)) << W(T_cnt, 4))
             | ((cast(T_tuple, tuplarr[3]) & W(T_tuple, 3)) << W(T_cnt, 6));

  byte = (qamtype).cassign(tuple_qam, tuple_qpsk);

  tuplcnt = (correlated).cassign(W(T_cnt, 0-1),
            (tuple_ready).cassign(W(T_cnt, 1-1),
            tuplcnt));

  corrcnt = (correlated).cassign(W(T_cnt, 1-1),
            corrcnt);

  SFG(tupler_out);
  PUT(byte);
  PUT(syncro);
  syncro = correlated;

  SFG(tupler_shiftin);
  for (i = 1; i < max(symbcount_qam, symbcount_qpsk); i++)
    tuplarr[i] = tuplarr[i-1];
  tuplarr[0] = (qamtype).cassign(corrarr[headlen_qam-1], corrarr[headlen_qpsk-1]);

  for (i = max(headlen_qam, headlen_qpsk)-1; i > 0; i--)
    corrarr[i] = corrarr[i-1];
  corrarr[0] = symbol_reg;

  SFG(tupler_finish_qam);
  corrcnt = (old_qamtype != qamtype).cassign(W(T_cnt,0),
            (corrcnt == W(T_cnt, headlen_qam)).cassign(corrcnt,
            corrcnt + W(T_cnt,1)));
  tuplcnt = (old_qamtype != qamtype).cassign(W(T_cnt,0),
            (correlated).cassign(W(T_cnt,0),
            (corrcnt != W(T_cnt, headlen_qam)).cassign(tuplcnt,
            (tuplcnt == W(T_cnt, symbcount_qam)).cassign(W(T_cnt,1),
            tuplcnt + W(T_cnt,1)))));
  old_qamtype = qamtype;

  SFG(tupler_finish_qpsk);
  corrcnt = (old_qamtype != qamtype).cassign(W(T_cnt,0),
            (corrcnt == W(T_cnt, headlen_qpsk)).cassign(corrcnt,
            corrcnt + W(T_cnt,1)));
  tuplcnt = (old_qamtype != qamtype).cassign(W(T_cnt,0),
            (correlated).cassign(W(T_cnt,0),
            (corrcnt != W(T_cnt, headlen_qpsk)).cassign(tuplcnt,
            (tuplcnt == W(T_cnt, symbcount_qpsk)).cassign(W(T_cnt,1),
            tuplcnt + W(T_cnt,1)))));
  old_qamtype = qamtype;

  // --

  AT(rst) ALLWAYS
    DO(tupler_reset)
    GOTO(phase1);

  AT(phase1) ALLWAYS
    DO(tupler_read)
    DO(tupler_test)
    DO(tupler_corr)
    GOTO(phase2);

  AT(phase2) ON(_cnd(iniphase) || (!_cnd(correlated) && !_cnd(tuple_ready)))
    GOTO(phase4);

  AT(phase2) ON(!_cnd(iniphase) && _cnd(correlated))
    DO(tupler_compose)
    GOTO(phase3);

  AT(phase2) ON(!_cnd(iniphase) && _cnd(tuple_ready) && !_cnd(correlated))
    DO(tupler_compose)
    GOTO(phase3);

  AT(phase3) ALLWAYS
    DO(tupler_out)
    GOTO(phase4);

  AT(phase4) ON(_cnd(qamtype))
    DO(tupler_shiftin)
    DO(tupler_finish_qam)
    GOTO(phase1);

  AT(phase4) ON(!_cnd(qamtype))
    DO(tupler_shiftin)
    DO(tupler_finish_qpsk)
    GOTO(phase1);

  _fsm.setinfo(verbose);
  ofstream F0("detuple_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("detuple_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());

}

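
The tupler_compose signal flow graph assembles one output byte from two 4-bit QAM16 symbols or from four 2-bit QPSK symbols, with tuplarr[0] occupying the least significant bits. The following is a minimal sketch of that packing, assuming plain `int` symbols held in a `std::vector` instead of the `_sigarray` signals of the listing; `pack_tuple_sketch` is an illustrative name that does not occur in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pack symbols into one byte; syms[0] lands in the least significant bits.
// bits_per_symbol is 4 for QAM16 (two symbols) and 2 for QPSK (four symbols).
int pack_tuple_sketch(const std::vector<int>& syms, int bits_per_symbol) {
    assert(bits_per_symbol == 2 || bits_per_symbol == 4);
    assert(static_cast<int>(syms.size()) * bits_per_symbol == 8);
    const int mask = (1 << bits_per_symbol) - 1;
    int tuple = 0;
    for (std::size_t i = 0; i < syms.size(); ++i)
        tuple |= (syms[i] & mask) << (static_cast<int>(i) * bits_per_symbol);
    return tuple;
}
```

For example, the QAM16 symbols {5, 10} pack to 0xA5, and the QPSK symbols {1, 2, 3, 0} pack to 1 + 2*4 + 3*16 = 57, in agreement with the weights 1, 4, 16 and 64 used in detuple::run().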

6.7 rx/lmsff.h

// Author : Radim Cmar
// Purpose: ADAPTIVE EQUALIZER (LMS)  @(#)lmsff.h 1.4 98/03/30

#ifndef LMS_H
#define LMS_H

#include "qlib.h"
#ifdef I2C
#include "i2c_master.h"
#include "i2c_slave.h"
#endif
#include "macros.h"
#include "typedefine.h"

class lmsff : public base {

public:
  clk & _ck;
#ifdef I2C
  i2c_slave _slave;
#endif
  PRT(constel_mode);
  PRT(in_sample);
  PRT(out_i);
  PRT(out_q);
  PRT(symtype);
  ctlfsm _fsm;

  int constel_type;      // QAM16 or QPSK
  int SPS;               // samples per symbol
  int CPS;               // cycles per sample
  int NF;                // forward taps
  int STEP;              // step adaptation constant
  double p0, p1, p2, p3;
  double ref;

public:
  enum { SPS_PAR, FWLENGTH, STEP_PAR, INIT, P0, P1, P2, P3, REF };
  enum { QAM16, QPSK };

  lmsff(char *name,
        clk & clk,
        _PRT(constel_mode),
        _PRT(in_sample),
        _PRT(out_i),
        _PRT(out_q),
        _PRT(symtype)
        );

  int setAttr(int Attr, double v=0);
  int run();
  void define();
  ctlfsm & fsm();
#ifdef I2C
  i2c_slave & slave();
#endif

  // untimed mode
  dfix decide(dfix constel, dfix est);
  dfix coefi[111];
  dfix coefq[111];
  dfix sample[111];

};

#endif
6.8 rx/lmsff.cxx

// Author : Radim Cmar
// Purpose: ADAPTIVE EQUALIZER (LMS) @(#)lmsff.cxx 1.18 98/04/07

#include "lmsff.h"
#include <math.h>
#include "trans.h"

lmsff::lmsff(char *name,
             clk &clk,
             _PRT(constel_mode),
             _PRT(in_sample),
             _PRT(out_i),
             _PRT(out_q),
             _PRT(symtype)
  ) : base(name),
  _ck(clk),
#ifdef I2C
  _slave(strapp(name, "_i2c_host")),
#endif
  IS_SIG(constel_mode, T_bit),
  IS_SIG(in_sample, T_float),
  IS_REG(out_i, _ck, T_float),
  IS_REG(out_q, _ck, T_float),
  IS_REG(symtype, _ck, T_bit)
{
  IS_IP(constel_mode);
  IS_IP(in_sample);
  IS_OP(out_i);
  IS_OP(out_q);
  IS_OP(symtype);

  SPS = 4;
  STEP = 4;
  NF = 8;
  ref = 3.0;
}

int lmsff::setAttr(int Attr, double v) {
  switch (Attr) {
  case SPS_PAR: // parametrizable only for untimed model
    SPS = (int) v;
    break;
  case FWLENGTH:
    NF = (int) v;
    break;
  case STEP_PAR:
    STEP = (int) v;
    break;
  case P0:
    p0 = v;
    break;
  case P1:
    p1 = v;
    break;
  case P2:
    p2 = v;
    break;
  case P3:
    p3 = v;
    break;
  case REF:
    ref = v;
    break;
  case INIT:
    cerr << "***_INFO:_LMSFF_equalizer
    for (int i = 0; i < NF; i++) {
      sample[i] = dfix(0);
      coefi[i] = dfix(0);
      coefq[i] = dfix(0);
    }
    int offs = (NF-4)/2;
    coefq[offs+0] = p0;
    coefi[offs+1] = p1;
    coefq[offs+2] = p2;
    coefi[offs+3] = p3;
    break;
  }
  return 1;
}

// --

int lmsff::run() {
  int i;
  dfix acci, accq, equali, equalq, esti, estq, erri, errq;

  if ((FBID(in_sample).getSize() < SPS) ||
      (FBID(constel_mode).getSize() < 1))
    return 0;

  dfix constel = FBID(constel_mode).getIndex(0);
  dfix step = 1.0/pow(2.0, STEP);

  // ff filtering
  acci = 0;
  accq = 0;
  for (i = 0; i < NF; i++) {
    acci = acci + sample[i] * coefi[i];
    accq = accq + sample[i] * coefq[i];
  }
  equali = acci;
  equalq = accq;

  // output
  FBID(out_i) << (equali);
  FBID(out_q) << (equalq);
  FBID(symtype) << (constel);

  // slicing
  esti = decide(constel, equali);
  estq = decide(constel, equalq);

  // error evaluation
  erri = esti - equali;
  errq = estq - equalq;

  // coefficient adaptation
  for (i = 0; i < NF; i++) {
    coefi[i] = coefi[i] + step * erri * sample[i];
    coefq[i] = coefq[i] + step * errq * sample[i];
  }

  // reading in samples
  for (i = NF-1; i >= SPS; i--)
    sample[i] = sample[i-SPS];
  for (i = SPS-1; i >= 0; i--)
    sample[i] = FBID(in_sample).get();

  return 1;
}
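The untimed run() above performs one LMS iteration per symbol: feed-forward filtering, slicing, error evaluation and coefficient adaptation. The filter-and-adapt part can be sketched in isolation as follows. This is only an illustration under stated assumptions: it uses plain double arithmetic instead of dfix, handles a single real-valued branch, and lms_step is a hypothetical helper name that does not appear in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the listing: one real-valued LMS step.
// 'desired' stands in for the slicer decision (esti or estq in run()).
// Returns the equalizer output computed before the taps are adapted.
double lms_step(std::vector<double>& coef,
                const std::vector<double>& sample,
                double desired, int step_log2) {
    assert(coef.size() == sample.size());
    // Same step size as run(): 1 / 2^STEP.
    const double step = 1.0 / static_cast<double>(1 << step_log2);

    double acc = 0.0;
    for (std::size_t i = 0; i < coef.size(); ++i)
        acc += sample[i] * coef[i];          // feed-forward filtering

    const double err = desired - acc;        // error evaluation

    for (std::size_t i = 0; i < coef.size(); ++i)
        coef[i] += step * err * sample[i];   // coefficient adaptation

    return acc;
}
```

Calling lms_step once for the I branch and once for the Q branch, each with its own coefficient vector and the shared sample vector, mirrors the structure of the loop bodies in run().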


dfix lmsff::decide(dfix constel, dfix est) {
  double c = ref/3;
  if (constel == QAM16) {
    if (est > dfix(2*c))
      return dfix(3*c);
    else if (est > dfix(0))
      return dfix(1*c);
    else if (est > dfix(-2*c))
      return dfix(-1*c);
    else
      return dfix(-3*c);
  } else {
    if (est > dfix(0.))
      return dfix(3*c);
    else
      return dfix(-3*c);
  }
}
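The decision rule of decide() can be exercised without the dfix library. The sketch below assumes double arithmetic and passes ref as an argument; slice and Constellation are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// Same ordering as the anonymous enum { QAM16, QPSK } in lmsff.h.
enum Constellation { QAM16, QPSK };

// Hypothetical double-precision counterpart of decide(): maps an equalized
// value on one axis to a constellation level, with thresholds at
// -2c, 0 and +2c where c = ref/3.
double slice(Constellation constel, double est, double ref) {
    assert(ref > 0.0);
    const double c = ref / 3.0;
    if (constel == QAM16) {
        if (est > 2 * c)  return 3 * c;
        if (est > 0.0)    return 1 * c;
        if (est > -2 * c) return -1 * c;
        return -3 * c;
    }
    // QPSK: only the sign of the estimate matters.
    return est > 0.0 ? 3 * c : -3 * c;
}
```

With ref = 3.0 the four QAM16 levels per axis are -3, -1, +1 and +3, and QPSK uses only the outer pair.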

// --

ctlfsm &lmsff::fsm() {
  return _fsm;
}

#ifdef I2C
i2c_slave &lmsff::slave() {
  return _slave;
}
#endif


#define CC(a) cast(accu_type, a)
void adder_tree(_sigarray &ops, int l, int h, _sig &res) {
  if (h-l+1 > 5) {
    cerr << "lmsff_error:_maximum_5_operands_suported\n";
    exit(1);
  }
  dfix &accu_type = res.Rep()->getVal();
  switch (h-l+1) {
  case 0: res = C(res, 0); break;
  case 1: res = CC(ops[l]); break;
  case 2: res = CC(ops[l] + ops[l+1]); break;
  case 3: res = CC(ops[l] + ops[l+1]) + CC(ops[l+2]); break;
  case 4: res = CC(ops[l] + ops[l+1]) + CC(ops[l+2] + ops[l+3]); break;
  case 5: res = CC(ops[l] + ops[l+1]) + CC(CC(ops[l+2] + ops[l+3]) + CC(ops[l+4])); break;
  }
}


void balance_coefs2(int numcoefs, int numcycles, int *l, int *h) {
  int i, j, k;

  int orig_numcycles = numcycles;
  if (numcoefs < numcycles)
    numcycles = numcoefs;

  int paral = numcoefs/numcycles;
  int incs = numcoefs - (numcoefs/numcycles)*numcycles;

  for (k = 1; k <= numcycles; k++)
    l[k] = (k-1)*paral;

  for (i = 1; i <= incs; i++)
    for (j = i+1; j <= numcycles; j++)
      l[j]++;

  for (k = 1; k <= numcycles-1; k++)
    h[k] = l[k+1]-1;
  h[numcycles] = numcoefs-1;

  for (k = numcycles+1; k <= orig_numcycles; k++) {
    l[k] = 0;
    h[k] = -1;
  }

  if (1) {
    cout << "lmsff_info:_filter_balancing\n";
    for (k = 1; k <= orig_numcycles; k++)
      cout << l[k] << ":" << h[k] << "_";
    cout << endl;
  }
}
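The balancing rule of balance_coefs2() splits the coefficient indices into one contiguous slice per cycle and hands the leftover coefficients to the first slices. A standalone sketch of the same rule is given below, assuming std::vector storage instead of raw int arrays; Slices and balance are hypothetical names used only here, and index 0 is left unused to keep the 1-based cycle numbering of the listing.

```cpp
#include <cassert>
#include <vector>

struct Slices {
    std::vector<int> lo;  // first coefficient index handled in cycle k (k >= 1)
    std::vector<int> hi;  // last coefficient index in cycle k, -1 if the cycle is idle
};

// Sketch of the balancing rule used by balance_coefs2().
Slices balance(int numcoefs, int numcycles) {
    assert(numcoefs > 0 && numcycles > 0);
    const int orig = numcycles;
    if (numcoefs < numcycles) numcycles = numcoefs;

    const int paral = numcoefs / numcycles;          // base slice width
    const int incs  = numcoefs - paral * numcycles;  // leftover coefficients

    // Cycles beyond 'numcycles' keep lo = 0, hi = -1, i.e. an empty slice.
    Slices s{std::vector<int>(orig + 1, 0), std::vector<int>(orig + 1, -1)};

    for (int k = 1; k <= numcycles; ++k)
        s.lo[k] = (k - 1) * paral;
    for (int i = 1; i <= incs; ++i)                  // widen the first 'incs' slices
        for (int j = i + 1; j <= numcycles; ++j)
            ++s.lo[j];
    for (int k = 1; k < numcycles; ++k)
        s.hi[k] = s.lo[k + 1] - 1;
    s.hi[numcycles] = numcoefs - 1;
    return s;
}
```

For 8 coefficients and 6 cycles this yields the slices 0:1, 2:3, 4:4, 5:5, 6:6 and 7:7, which are the values hard-coded into l_fil and h_fil in define() before balance_coefs2() is called.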



void lmsff::define() {

  if (NF < 6) {
    cerr << "lmsff_error:_minimum_6_coefs_required\n";
    exit(1);
  }

  int i, k, p;

  // SPS .... samples per symbol parameter
  // CPS .... cycles per sample (every CPS-phase read sample)
  // NCYC ... cycle budget in the loop
  // F_max_delay ... extra delay line positions due to read_sample within filtering
  SPS = 4;
  CPS = 2;
  int F_max_delay = 7;
  int NCYC = SPS*CPS;

  //== distribute filtering operation slices into NCYC-2 cycles

  int l_fil[100];
  int h_fil[100];
  int l_upd[100];
  int h_upd[100];

  // budget is fixed: 8-2 = 6 cycles
  // let's have 8 coefs
  // can be more elaborate (e.g. interleaved slicing)
  int start_fil = 1; // for filtering to know to store first time
  int end_fil = 6;   // for filtering to know to store to I_equal

  l_fil[1]=0; l_fil[2]=2; l_fil[3]=4; l_fil[4]=5; l_fil[5]=6; l_fil[6]=7;
  h_fil[1]=1; h_fil[2]=3; h_fil[3]=4; h_fil[4]=5; h_fil[5]=6; h_fil[6]=7;
  l_upd[1]=0; l_upd[2]=2; l_upd[3]=4; l_upd[4]=5; l_upd[5]=6; l_upd[6]=7;
  h_upd[1]=1; h_upd[2]=3; h_upd[3]=4; h_upd[4]=5; h_upd[5]=6; h_upd[6]=7;
  // was example what input we need for parametrizable filter definition

  balance_coefs2(NF, 6, l_fil, h_fil);
  balance_coefs2(NF, 6, l_upd, h_upd);

  // ======= definition of signals =======

  PORT_TYPE(in_sample, T(T_sample_lms));
  PORT_TYPE(out_i, T(T_sample_lms));
  PORT_TYPE(out_q, T(T_sample_lms));

  dfix T_step(0, 5, 0, dfix::ns); // shifts 0 -> 31

  _sigarray Fi_coef("Fi_coef", NF, &_ck, T(T_Fcoef_lms));
  _sigarray Fq_coef("Fq_coef", NF, &_ck, T(T_Fcoef_lms));
  _sigarray I_sample("I_sample", NF+F_max_delay, &_ck, T(T_sample_lms));
  _sigarray Fi_mult("Fi_mult", NF, T(T_accu_lms));
  _sigarray Fq_mult("Fq_mult", NF, T(T_accu_lms));
  _sig Fi_sum("Fi_sum", T(T_accu_lms));
  _sig Fq_sum("Fq_sum", T(T_accu_lms));
  _sigarray fm_i("fm_i", NF, T(T_accu_lms));
  _sigarray fm_q("fm_q", NF, T(T_accu_lms));
  _sigarray fmult_i("fmult_i", NF, T(T_Fcoef_lms));
  _sigarray fmult_q("fmult_q", NF, T(T_Fcoef_lms));
  SIGCK(I_accu, _ck, T(T_accu_lms));
  SIGCK(Q_accu, _ck, T(T_accu_lms));
  SIGW(I_equal, T(T_accu_lms));
  SIGW(Q_equal, T(T_accu_lms));
  SIGCK(I_error, _ck, T(T_accu_lms));
  SIGCK(Q_error, _ck, T(T_accu_lms));
  SIGW(I_slice, T(T_equal_lms));
  SIGW(Q_slice, T(T_equal_lms));
  SIGCK(step, _ck, T_step);
  SIGCK(constel, _ck, T_bit);


#ifdef I2C
  _slave.put(&step);
  for (i = 0; i < NF; i++)
    _slave.put(&Fi_coef[i]);
  for (i = 0; i < NF; i++)
    _slave.put(&Fq_coef[i]);
#endif

  // definition of states

  cfsm = &_fsm; // controller handle

  int phi;
  state *loop_cycle[100];
  state *rst_cycle;

  rst_cycle = new state;       // define the state
  *rst_cycle << "rst";         // name the state
  *cfsm << deflt(*rst_cycle);  // assign the state to the controller

  for (phi = 1; phi <= NCYC; phi++) {
    loop_cycle[phi] = new state;
    *loop_cycle[phi] << strapp("cycle_", phi);
    *cfsm << *loop_cycle[phi];
  }


  // definition of sfg's --

  sfg *_lms_filt[100];
  sfg *_lms_update_coefs[100];

  SFG(lms_read_allways);
  GET(constel_mode);
  constel = constel_mode;

  SFG(lms_initialize_coefs);
  int offs = (NF-4)/2;
  Fq_coef[offs+0] = W(T(T_Fcoef_lms), p0);
  Fq_coef[offs+1] = W(T(T_Fcoef_lms), 0);
  Fq_coef[offs+2] = W(T(T_Fcoef_lms), p2);
  Fq_coef[offs+3] = W(T(T_Fcoef_lms), 0);

  Fi_coef[offs+0] = W(T(T_Fcoef_lms), 0);
  Fi_coef[offs+1] = W(T(T_Fcoef_lms), p1);
  Fi_coef[offs+2] = W(T(T_Fcoef_lms), 0);
  Fi_coef[offs+3] = W(T(T_Fcoef_lms), p3);

  // zero every tap outside the four initialized positions
  for (i = 0; i < NF; i++) {
    if ((i < offs) || (i > offs+3)) {
      Fi_coef[i] = W(T(T_Fcoef_lms), 0);
      Fq_coef[i] = W(T(T_Fcoef_lms), 0);
    }
  }

  SFG(lms_reset);
  for (i = 0; i < NF+F_max_delay; i++) {
    I_sample[i] = W(T(T_sample_lms), 0);
  }
  setv(I_error, 0);
  setv(Q_error, 0);
  setv(step, STEP);


  // FILTER (1. cycle to 8. cycle) --
  int delay = 0; int cnt = 0;
  int L, H;

  // no filtering in 1st clock cycle
  cnt++; if (cnt == CPS) { cnt = 0; delay++; }

  for (p = 1; p <= NCYC-2; p++) {
    REGISTER_SFG(lms_filt, p);
    cnt++; if (cnt == CPS) { cnt = 0; delay++; }

    // filter feedforward
    L = l_fil[p]; H = h_fil[p];
    for (k = L; k <= H; k++)
      Fi_mult[k] = cast(T(T_accu_lms), Fi_coef[k]*I_sample[k+delay]);
    if (H >= 0) adder_tree(Fi_mult, L, H, Fi_sum);

    for (k = L; k <= H; k++)
      Fq_mult[k] = cast(T(T_accu_lms), Fq_coef[k]*I_sample[k+delay]);
    if (H >= 0) adder_tree(Fq_mult, L, H, Fq_sum);

    // sum I over start_ff -> end_ff
    if (p == start_fil) {
      I_accu = Fi_sum;
      Q_accu = Fq_sum;
    }
    else if ((p > start_fil) && (p < end_fil)) {
      I_accu = I_accu + Fi_sum;
      Q_accu = Q_accu + Fq_sum;
    }
    else if (p == end_fil) {
      I_accu = I_accu + Fi_sum;
      Q_accu = Q_accu + Fq_sum;
      I_equal = I_accu + Fi_sum;
      Q_equal = Q_accu + Fq_sum;
    }
  } // end for

  // compensate for 1 clock cycle vacancy
  cnt++; if (cnt == CPS) { cnt = 0; delay++; }


  // UPDATE (1. cycle to 8. cycle)
  int STEPSAFE = 4; // safety region for downshifting
  for (p = 1; p <= NCYC-2; p++) {
    REGISTER_SFG(lms_update_coefs, p);
    cnt++; if (cnt == CPS) { cnt = 0; delay++; }

    L = l_upd[p]; H = h_upd[p];
    for (k = L; k <= H; k++)
    {
      fm_i[k] = cast(T(T_accu_lms), I_sample[k+delay]*I_error);
      vshr(fmult_i[k], fm_i[k], step, STEPSAFE);
      Fi_coef[k] = Fi_coef[k] + fmult_i[k];

      fm_q[k] = cast(T(T_accu_lms), I_sample[k+delay]*Q_error);
      vshr(fmult_q[k], fm_q[k], step, STEPSAFE);
      Fq_coef[k] = Fq_coef[k] + fmult_q[k];
    }
  }

  SFG(lms_outready);
  out_i = cast(T(T_sample_lms), I_equal);
  out_q = cast(T(T_sample_lms), Q_equal);
  symtype = constel;
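The FILTER and UPDATE loops share the counters cnt and delay, which track how far the sample delay line has moved by the time each signal flow graph executes. The sketch below replays only that bookkeeping, with no OCAPI objects involved; Delays and replay_delays are hypothetical names, and the function is an illustration of the counter logic rather than part of the design.

```cpp
#include <cassert>
#include <vector>

struct Delays {
    std::vector<int> filter;  // delay seen by _lms_filt[p], p = 1..NCYC-2
    std::vector<int> update;  // delay seen by _lms_update_coefs[p], p = 1..NCYC-2
};

// Replays the cnt/delay counters of define() for given SPS and CPS values.
Delays replay_delays(int sps, int cps) {
    assert(sps > 0 && cps > 0);
    const int ncyc = sps * cps;
    int cnt = 0, delay = 0;
    // One clock cycle: a new sample arrives every 'cps' cycles.
    auto tick = [&] { if (++cnt == cps) { cnt = 0; ++delay; } };

    Delays d;
    tick();                                   // no filtering in the 1st clock cycle
    for (int p = 1; p <= ncyc - 2; ++p) {
        tick();
        d.filter.push_back(delay);
    }
    tick();                                   // 1 clock cycle of vacancy
    for (int p = 1; p <= ncyc - 2; ++p) {
        tick();
        d.update.push_back(delay);
    }
    return d;
}
```

With SPS = 4 and CPS = 2 the filter slices see delays 1, 1, 2, 2, 3, 3 and the update slices see 4, 5, 5, 6, 6, 7. The largest value, 7, equals F_max_delay, which is why I_sample is declared with NF+F_max_delay entries.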


  // SLICER

  SFG(lms_slice_and_error);
  double c = ref/3;
  I_equal = I_accu;
  Q_equal = Q_accu;

  I_slice = (constel == W(T_bit, 0)).cassign(

    (I_equal > C(I_equal, +2*c)).cassign(C(I_slice, +3*c),
    (I_equal > C(I_equal, 0*c)).cassign(C(I_slice, +1*c),
    (I_equal > C(I_equal, -2*c)).cassign(C(I_slice, -1*c),
                                         C(I_slice, -3*c)))),

    (I_equal > C(I_equal, 0*c)).cassign(C(I_slice, +3*c),
                                        C(I_slice, -3*c))
  );

  Q_slice = (constel == W(T_bit, 0)).cassign(

    (Q_equal > C(Q_equal, +2*c)).cassign(C(Q_slice, +3*c),
    (Q_equal > C(Q_equal, 0*c)).cassign(C(Q_slice, +1*c),
    (Q_equal > C(Q_equal, -2*c)).cassign(C(Q_slice, -1*c),
                                         C(Q_slice, -3*c)))),

    (Q_equal > C(Q_equal, 0*c)).cassign(C(Q_slice, +3*c),
                                        C(Q_slice, -3*c))
  );

  I_error = cast(T(T_accu_lms), I_slice) - I_equal;
  Q_error = cast(T(T_accu_lms), Q_slice) - Q_equal;


  // IO definition

  SFG(lms_in);
  GET(in_sample);
  I_sample[0] = in_sample;
  for (i = 1; i <= NF+F_max_delay-1; i++) {
    I_sample[i] = I_sample[i-1];
  }

  SFG(lms_out);
  PUT(out_i);
  PUT(out_q);
  PUT(symtype);


  // define the fsm for

  DEFAULTDO(lms_read_allways);
  *rst_cycle ALLWAYS
    DO(lms_reset)
    DO(lms_initialize_coefs)
    << *loop_cycle[1];

  *loop_cycle[1] ALLWAYS
    DO(lms_in)
    << *_lms_update_coefs[1]
    << *loop_cycle[2];

  *loop_cycle[2] ALLWAYS
    << *_lms_filt[1]
    << *_lms_update_coefs[2]
    << *loop_cycle[3];

  *loop_cycle[3] ALLWAYS
    DO(lms_in)
    << *_lms_filt[2]
    << *_lms_update_coefs[3]
    << *loop_cycle[4];

  *loop_cycle[4] ALLWAYS
    << *_lms_filt[3]
    << *_lms_update_coefs[4]
    << *loop_cycle[5];

  *loop_cycle[5] ALLWAYS
    DO(lms_in)
    << *_lms_filt[4]
    << *_lms_update_coefs[5]
    << *loop_cycle[6];

  *loop_cycle[6] ALLWAYS
    << *_lms_filt[5]
    << *_lms_update_coefs[6]
    << *loop_cycle[7];

  *loop_cycle[7] ALLWAYS
    DO(lms_in)
    << *_lms_filt[6] // filtering finished -> ready to output
    DO(lms_outready)
    << *loop_cycle[8];

  *loop_cycle[8] ALLWAYS
    DO(lms_out)
    DO(lms_slice_and_error)
    << *loop_cycle[1];


#ifdef I2C
  _slave.attach(_fsm, *loop_cycle[1], _ck);
#endif

  _fsm.setinfo(verbose);
  ofstream F0("lmsff_trans0.dot");
  F0 << _fsm;
  F0.close();

  transform TRANSF(_fsm);
  TRANSF.fsm_handshake1(_ck);

  ofstream F("lmsff_trans.dot");
  F << _fsm;
  F.close();
  _fsm.setinfo(silent);

  FSMEXP(typeName());

}


6.9 rx/macros.h

// @(#)macros.h 1.1 98/01/22

#ifndef MACROS_H
#define MACROS_H

// #define max(a,b) (a > b) ? a : b

#include "qlib.h"

extern dfix T_bit;
extern dfix T_2bit;
extern dfix T_4bit;
extern dfix T_8bit;
extern dfix T_float;

extern dfix T_Cshift; // type for constant shifter
extern dfix *overcast;
extern dfix ycast;
extern strstream *gstr;


#define PRT(v) FB &_##v; _sig v
#define _PRT(v) FB &_##v
#define IS_SIG(v,t) _##v(_##v), v(#v,t)
#define IS_REG(v,c,t) _##v(_##v), v(#v,c,t)
#define GET(v) IN(v, _##v)
#define PUT(v) OUT(v, _##v)
#define IS_OP(v) _##v.asSink(this)
#define IS_IP(v) _##v.asSource(this)
#define FBID(v) _##v

#define C(y,x) W((y).Rep()->getVal(), x)
#define acast(y,x) cast((y).Rep()->getVal(), x)

#define setv(y,x) y = W(y.Rep()->getVal(), x);

#define REGISTER_SFG(s,i) _##s[i] = new sfg; \
        _##s[i]->next = glbListOfSfg; \
        glbListOfSfg = _##s[i]; \
        *_##s[i] << strapp(strapp(#s, "_"), i); \
        _##s[i]->starts(); \
        csfg = _##s[i]

#define PORT_TYPE(v,t) v.Rep()->dupVal(t); \
        if (v.Rep()->isregister()) v.Rep()->dupRegVal(t)

#define DSIGW(s,n,w) s[n] = new _sig(strapp(strapp(#s, "_"), n), w)

// constant right-shift (division)
// --
#define shr(y,x,b) \
        overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()+b); \
        ycast.duplicate(y.Rep()->getVal()); \
        y = cast(ycast, cast(*overcast, x) >> W(T_Cshift, b)); \
        delete overcast;

// constant left-shift (multiplication)
// --
#define shl(y,x,b) \
        if (x.Rep()->getVal().isFix()) \
          overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()); \
        else \
          overcast = new dfix(0); \
        ycast.duplicate(y.Rep()->getVal()); \
        y = cast(ycast, cast(*overcast, x) << W(T_Cshift, b)); \
        delete overcast;

// variable shifters with safety region
// --
//
// description: vshl(y,x,e,b); = y = x<<e (with 'b' as a safety region)
//
#define vshl(y,x,e,b) \
        overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()); \
        y = acast(y, cast(*overcast, x) << e); \
        delete overcast;

#define vshr(y,x,e,b) \
        if (x.Rep()->getVal().isFix()) \
          overcast = new dfix(0, x.Rep()->getVal().TypeW()+b, x.Rep()->getVal().TypeL()+b); \
        else \
          overcast = new dfix(0); \
        y = acast(y, cast(*overcast, x) >> e); \
        delete overcast;


#endif


6.10 rx/macros.cxx

#include "macros.h"

dfix T_bit(0, 1, 0, dfix::ns);
dfix T_2bit(0, 2, 0, dfix::tc);
dfix T_4bit(0, 4, 0, dfix::ns);
dfix T_8bit(0, 8, 0, dfix::ns);
dfix T_float(0);

dfix T_Cshift(0, 4, 0, dfix::ns); // type for constant shifter 0..15
dfix *overcast;
dfix ycast;
strstream *gstr;

6.11 rx/typedefine.cxx

#include "typedefine.h"

#include <fstream.h>

typedefine glbTypes;

typedefine::typedefine() {
  numt = 0;
}

void typedefine::load(char *_name) {
  ifstream IF(_name);

  if (IF.fail()) {
    cerr << "***_ERROR:_typedefine:_cannot_open_file_" << _name << "\n";
    exit(0);
  }
  while (!IF.eof() && !IF.fail()) {
    char buf[100];
    IF >> buf;

    if (!strlen(buf))
      continue;

    if (buf[0] == '/' && buf[1] == '/') {
      int endoftype = 0;
      while (!endoftype) {
        char c;
        IF.get(c);
        endoftype = (c == '\n');
      }
      continue;
    } else {
      name[numt] = new char[strlen(buf)+1];
      strcpy(name[numt], buf);
      int i;
      for (i = 0; i < numt; i++)
        if (!strcmp(name[i], buf)) {
          cerr << "***_ERROR:_typedefine:_type_" << buf << "_defined_twice\n";
          exit(0);
        }
      int W, L, repr = dfix::tc, overflow = dfix::err, truncate = dfix::fl;

      IF >> buf;
      W = atoi(buf);
      if (W == 0) {
        cerr << "***_ERROR:_typedefine:_bad_W_for_type_" << name[numt] << "\n";
        exit(0);
      }

      int endcom = 0;

      IF >> buf;
      L = atoi(buf);
      if (buf[strlen(buf)-1] == ';') {
        endcom = 1;
        buf[strlen(buf)-1] = 0;
      }
      while (1) {
        if (endcom)
          break;

        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "ns"))
          repr = dfix::ns;
        else if (!strcmp(buf, "tc"))
          repr = dfix::tc;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << "_bad_repr_" << buf << "\n";
          exit(0);
        }


        if (endcom)
          break;


        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "wp"))
          overflow = dfix::wp;
        else if (!strcmp(buf, "st"))
          overflow = dfix::st;
        else if (!strcmp(buf, "er"))
          overflow = dfix::err;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << "_bad_ovf_" << buf << "\n";
          exit(0);
        }

        if (endcom)
          break;

        IF >> buf;

        if (buf[strlen(buf)-1] == ';') {
          endcom = 1;
          buf[strlen(buf)-1] = 0;
        }

        if (!strcmp(buf, "rd"))
          truncate = dfix::rd;
        else if (!strcmp(buf, "fl"))
          truncate = dfix::fl;
        else if (!strcmp(buf, ";"))
          break;
        else if (!endcom) {
          cerr << "***_ERROR:_typedefine:_" << name[numt] << "_bad_rnd_" << buf << "\n";
          exit(0);
        }

        if (endcom)
          break;

        int endoftype = 0;
        while (!endoftype) {
          char c;
          IF.get(c);
          endoftype = (c == '\n');
        }
        break;
      }
      types[numt].duplicate(dfix(0, W, L, repr, overflow, truncate));

      numt++;
      if (numt >= MAXT) {
        cerr << "***_ERROR:_typedefine_has_too_much_types._increase_MAXT\n";
        exit(0);
      }
    }
  }
}
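As read from load(), each entry of the type file consists of a type name, a word length W, a fractional length L, and then optionally a representation (ns or tc), an overflow mode (wp, st or er) and a rounding mode (rd or fl), terminated by a semicolon; lines starting with // are comments. The sketch below parses one such line using standard streams. It is an illustration only: TypeSpec and parse_type_line are hypothetical names, the tokens are stored verbatim without validation, and the sample line used in the usage note is invented rather than taken from the actual type file.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>

struct TypeSpec {
    std::string name;
    int W = 0;                     // word length
    int L = 0;                     // fractional length
    std::string repr = "tc";       // default mirrors dfix::tc in load()
    std::string overflow = "er";   // default mirrors dfix::err in load()
    std::string round = "fl";      // default mirrors dfix::fl in load()
};

// Hypothetical single-line parser. Returns false for an empty line,
// a comment line, or a missing or zero word length.
bool parse_type_line(const std::string& line, TypeSpec& out) {
    std::istringstream in(line);
    std::string tok;
    if (!(in >> tok) || tok.compare(0, 2, "//") == 0) return false;

    TypeSpec t;
    t.name = tok;
    if (!(in >> tok)) return false;
    t.W = std::atoi(tok.c_str());
    if (t.W == 0) return false;
    if (!(in >> tok)) return false;
    t.L = std::atoi(tok.c_str());              // atoi stops at a trailing ';'
    bool done = tok.back() == ';';

    // Optional fields appear in this fixed order, as in load().
    std::string* fields[] = {&t.repr, &t.overflow, &t.round};
    for (std::string* f : fields) {
        if (done || !(in >> tok)) break;
        if (tok.back() == ';') { done = true; tok.pop_back(); }
        if (!tok.empty()) *f = tok;
    }
    out = t;
    return true;
}
```

For example, the invented line "T_sample_lms 12 10 tc st rd;" would produce W = 12, L = 10 and the modes tc, st and rd, while "T_bit 1 0;" keeps the three defaults.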


void typedefine::list() {
  int i;

  for (i = 0; i < numt; i++) {
    cout.width(20);
    cout << name[i];

    cout.width(5);
    cout << types[i].TypeW();

    cout.width(5);
    cout << types[i].TypeL();

    cout.width(4);
    if (types[i].TypeSign() == dfix::ns)
      cout << "ns";
    else
      cout << "tc";

    cout.width(4);
    if (types[i].TypeOverflow() == dfix::wp)
      cout << "wp";
    else if (types[i].TypeOverflow() == dfix::st)
      cout << "st";
    else
      cout << "err";

    cout.width(4);
    if (types[i].TypeRound() == dfix::fl)
      cout << "fl";
    else
      cout << "rd";

    cout << "\n";
  }
}

static dfix dummy(0);

dfix &typedefine::find(char *_name) {
  int i;
  if (!numt)
    return dummy;
  for (i = 0; i < numt; i++)
    if (!strcmp(name[i], _name))
      return types[i];
  cerr << "***_WARNING:_typedefine:_type_" << _name << "_was_not_found\n";
  return dummy;
}

dfix &typedefine::find(char *_name, dfix &v) {
  int i;
  if (!numt)
    return v;
  for (i = 0; i < numt; i++)
    if (!strcmp(name[i], _name))
      return types[i];
  cerr << "***_WARNING:_typedefine:_type_" << _name << "_was_not_found\n";
  return v;
}


6.12 rx/typedefine.h

#ifndef TYPEDEFINE_H
#define TYPEDEFINE_H

#define MAXT 100

#include "qlib.h"


class typedefine {
  char *name[100];
  dfix types[MAXT];
  int numt;
public:
  typedefine();
  void load(char *file);
  void list();
  dfix &find(char *name);
  dfix &find(char *name, dfix &v);
};

extern typedefine glbTypes;

#define LOADTYPES(a) glbTypes.load(#a); glbTypes.list()
#define T(a) glbTypes.find(#a)
#define TT(a,b) glbTypes.find(#a,b)

#endif

Part C: Generated VHDL code of the QAM system 

6.13 vhdl/RX_TI.vhd

-- OCAPI - alpha release - generated Fri Jun 12 16:45:44 1998

-- System Link Cell for design RX_TI

library IEEE;
use IEEE.std_logic_1164.all;

entity RX_TI is
  port (
    reset: in std_logic;
    clk: in std_logic;
    chan_out: in std_logic_vector(11 downto 0);
    rx_diff_mode: in std_logic;
    rx_constel_mode: in std_logic;
    rx_byte_out: out std_logic_vector(7 downto 0);
    rx_sync_out: out std_logic
  );
end RX_TI;

architecture structure of RX_TI is

  component lmsff
    port (
      reset: in std_logic;
      clk: in std_logic;
      h1wack: in std_logic;
      constel_mode: in std_logic;
      in_sample: in std_logic_vector(11 downto 0);
      h1wreq: out std_logic;
      out_i: out std_logic_vector(11 downto 0);
      out_q: out std_logic_vector(11 downto 0);
      symtype: out std_logic
    );
  end component;

  component demap
    port (
      reset: in std_logic;
      clk: in std_logic;
      h2wack: in std_logic;
      h1rack: in std_logic;
      diff_mode: in std_logic;
      i_in: in std_logic_vector(11 downto 0);
      q_in: in std_logic_vector(11 downto 0);
      symtype_in: in std_logic;
      h2wreq: out std_logic;
      h1rreq: out std_logic;
      symbol_out: out std_logic_vector(3 downto 0);
      symtype_out: out std_logic
    );
  end component;

  component detuple
    port (
      reset: in std_logic;
      clk: in std_logic;
      h3wack: in std_logic;
      h2rack: in std_logic;
      symbol: in std_logic_vector(3 downto 0);
      symtype: in std_logic;
      h3wreq: out std_logic;
      h2rreq: out std_logic;
      byte: out std_logic_vector(7 downto 0);
      syncro: out std_logic
    );
  end component;

  component derand
    port (
      reset: in std_logic;
      clk: in std_logic;
      h3rack: in std_logic;
      byte_in: in std_logic_vector(7 downto 0);
      syncro: in std_logic;
      h3rreq: out std_logic;
      byte_out: out std_logic_vector(7 downto 0);
      sync_out: out std_logic
    );
  end component;

  signal unused: std_logic;
  signal h1_ffshk: std_logic;
  signal rx_lms_i: std_logic_vector(11 downto 0);
  signal rx_lms_q: std_logic_vector(11 downto 0);
  signal rx_symtype: std_logic;
  signal h2_ffshk: std_logic;
  signal h1_fbshk: std_logic;
  signal rx_symbol: std_logic_vector(3 downto 0);
  signal rx_symtype_at: std_logic;
  signal h3_ffshk: std_logic;
  signal h2_fbshk: std_logic;
  signal rx_byte_rnd: std_logic_vector(7 downto 0);
  signal rx_syncro: std_logic;
  signal h3_fbshk: std_logic;

begin

  lmsff_proc: lmsff
    port map (
      reset => reset,
      clk => clk,
      h1wack => h1_fbshk,
      constel_mode => rx_constel_mode,
      in_sample => chan_out,
      h1wreq => h1_ffshk,
      out_i => rx_lms_i,
      out_q => rx_lms_q,
      symtype => rx_symtype
    );

  demap_proc: demap
    port map (
      reset => reset,
      clk => clk,
      h2wack => h2_fbshk,
      h1rack => h1_ffshk,
      diff_mode => rx_diff_mode,
      i_in => rx_lms_i,
      q_in => rx_lms_q,
      symtype_in => rx_symtype,
      h2wreq => h2_ffshk,
      h1rreq => h1_fbshk,
      symbol_out => rx_symbol,
      symtype_out => rx_symtype_at
    );

  detuple_proc: detuple
    port map (
      reset => reset,
      clk => clk,
      h3wack => h3_fbshk,
      h2rack => h2_ffshk,
      symbol => rx_symbol,
      symtype => rx_symtype_at,
      h3wreq => h3_ffshk,
      h2rreq => h2_fbshk,
      byte => rx_byte_rnd,
      syncro => rx_syncro
    );

  derand_proc: derand
    port map (
      reset => reset,
      clk => clk,
      h3rack => h3_ffshk,
      byte_in => rx_byte_rnd,
      syncro => rx_syncro,
      h3rreq => h3_fbshk,
      byte_out => rx_byte_out,
      sync_out => rx_sync_out
    );

end structure;

6.14 vhdl/derand_proc.vhd




-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998

-- includes sfg
--   derandrstphase10
--   derandphase1phase20
--   derandphase1phase11
--   derandphase2phase10
--   derandinireg_derandrst0

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

entity derand_proc is
  port (
    clk: in std_logic;
    reset: in std_logic;
    h3rack: in FX(0 downto 0);
    syncro: in FX(0 downto 0);
    byte_in: in FX(7 downto 0);
    h3rreq: out FX(0 downto 0);
    h3rackreg_reg: out FX(0 downto 0);
    byte_out_reg: out FX(7 downto 0);
    sync_out_reg: out FX(0 downto 0)
  );
end derand_proc;



6.15 vhdl/derand_proc_RTL.vhd

-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998
--
-- includes sfg
-- derandrstphase10
-- derandphase1phase20
-- derandphase1phase11
-- derandphase2phase10
-- derandinireg_derandrst0
--

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

architecture RTL of derand_proc is

    -- State Declaration
    signal seed_at1      : FX (15 downto 0);
    signal seed          : FX (15 downto 0);
    signal shiftreg_at1  : FX (15 downto 0);
    signal shiftreg      : FX (15 downto 0);
    signal bypass_at1    : FX (0 downto 0);
    signal bypass        : FX (0 downto 0);
    signal h3rackreg_at1 : FX (0 downto 0);
    signal h3rackreg     : FX (0 downto 0);
    signal byte_out_at1  : FX (7 downto 0);
    signal byte_out      : FX (7 downto 0);
    signal sync_out_at1  : FX (0 downto 0);
    signal sync_out      : FX (0 downto 0);
    type STATE_TYPE is (
        rst,
        phase1,
        phase2,
        inireg_derand);
    signal current_state, next_state : STATE_TYPE;

begin

    h3rackreg_reg <= h3rackreg_at1;

    byte_out_reg <= byte_out_at1;

    sync_out_reg <= sync_out_at1;

    -- Register clocking
    SYNC : process (clk)

    begin
        if (clk'event and clk = '1') then
            -- state update
            current_state <= next_state;
            -- tick all registers
            seed_at1 <= seed;
            shiftreg_at1 <= shiftreg;
            bypass_at1 <= bypass;
            h3rackreg_at1 <= h3rackreg;
            byte_out_at1 <= byte_out;
            sync_out_at1 <= sync_out;

        end if;

    end process;







    -- SFG evaluation
    COMB : process (
        current_state,
        reset,
        h3rack,
        syncro,
        seed_at1,
        shiftreg_at1,
        bypass_at1,
        byte_in,
        h3rackreg_at1,
        byte_out_at1,
        sync_out_at1)

        -- intermediate variables
        variable shifts_0 : FX (15 downto 0);
        variable xbits_0  : FX (0 downto 0);
        variable masks_0  : FX (7 downto 0);
        variable shifts_1 : FX (15 downto 0);
        variable xbits_1  : FX (0 downto 0);
        variable masks_1  : FX (7 downto 0);
        variable shifts_2 : FX (15 downto 0);
        variable xbits_2  : FX (0 downto 0);
        variable masks_2  : FX (7 downto 0);
        variable shifts_3 : FX (15 downto 0);
        variable xbits_3  : FX (0 downto 0);
        variable masks_3  : FX (7 downto 0);
        variable shifts_4 : FX (15 downto 0);
        variable xbits_4  : FX (0 downto 0);
        variable masks_4  : FX (7 downto 0);
        variable shifts_5 : FX (15 downto 0);
        variable xbits_5  : FX (0 downto 0);
        variable masks_5  : FX (7 downto 0);
        variable shifts_6 : FX (15 downto 0);
        variable xbits_6  : FX (0 downto 0);
        variable masks_6  : FX (7 downto 0);
        variable shifts_7 : FX (15 downto 0);
        variable xbits_7  : FX (0 downto 0);
        variable masks_7  : FX (7 downto 0);
        variable shifts_8 : FX (15 downto 0);
        variable masks_8  : FX (7 downto 0);
        variable mask     : FX (7 downto 0);

    begin

        -- update all registers and outputs
        h3rreq <= CAST("0.");
        seed <= seed_at1;
        shiftreg <= shiftreg_at1;
        bypass <= bypass_at1;
        h3rackreg <= h3rackreg_at1;
        byte_out <= byte_out_at1;
        sync_out <= sync_out_at1;

        -- default update state register
        next_state <= current_state;

        case current_state is

            when rst =>

                byte_out <= CAST("00000000.");
                seed <= CAST("0000000000111111.");
                sync_out <= CAST("0.");
                bypass <= CAST("0.");
                shiftreg <= CAST("0000000000000000.");
                h3rackreg <= h3rack;
                h3rreq <= CAST("1.");
                next_state <= phase1;

            when phase1 =>

                if ((true) and (ToBool(h3rackreg_at1))) then
                    shifts_0 := cassign(syncro = CAST("1."),
                                        seed_at1,
                                        shiftreg_at1);
                    masks_0 := CAST("00000000.");
                    xbits_0 := (CAST(0, 0, SHR(shifts_0, 4))) xor (CAST(0, 0, SHR(shifts_0, 5)));
                    shifts_1 := ((CAST(15, 0, xbits_0)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_0, 1)) and (CAST("0000000001111111.")));
                    masks_1 := (SHL(masks_0, 1)) or ((CAST(7, 0, xbits_0)) and (CAST("00000001.")));
                    xbits_1 := (CAST(0, 0, SHR(shifts_1, 4))) xor (CAST(0, 0, SHR(shifts_1, 5)));
                    shifts_2 := ((CAST(15, 0, xbits_1)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_1, 1)) and (CAST("0000000001111111.")));
                    masks_2 := (SHL(masks_1, 1)) or ((CAST(7, 0, xbits_1)) and (CAST("00000001.")));
                    xbits_2 := (CAST(0, 0, SHR(shifts_2, 4))) xor (CAST(0, 0, SHR(shifts_2, 5)));
                    shifts_3 := ((CAST(15, 0, xbits_2)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_2, 1)) and (CAST("0000000001111111.")));
                    masks_3 := (SHL(masks_2, 1)) or ((CAST(7, 0, xbits_2)) and (CAST("00000001.")));
                    xbits_3 := (CAST(0, 0, SHR(shifts_3, 4))) xor (CAST(0, 0, SHR(shifts_3, 5)));
                    shifts_4 := ((CAST(15, 0, xbits_3)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_3, 1)) and (CAST("0000000001111111.")));
                    masks_4 := (SHL(masks_3, 1)) or ((CAST(7, 0, xbits_3)) and (CAST("00000001.")));
                    xbits_4 := (CAST(0, 0, SHR(shifts_4, 4))) xor (CAST(0, 0, SHR(shifts_4, 5)));
                    shifts_5 := ((CAST(15, 0, xbits_4)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_4, 1)) and (CAST("0000000001111111.")));
                    masks_5 := (SHL(masks_4, 1)) or ((CAST(7, 0, xbits_4)) and (CAST("00000001.")));
                    xbits_5 := (CAST(0, 0, SHR(shifts_5, 4))) xor (CAST(0, 0, SHR(shifts_5, 5)));
                    shifts_6 := ((CAST(15, 0, xbits_5)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_5, 1)) and (CAST("0000000001111111.")));
                    masks_6 := (SHL(masks_5, 1)) or ((CAST(7, 0, xbits_5)) and (CAST("00000001.")));
                    xbits_6 := (CAST(0, 0, SHR(shifts_6, 4))) xor (CAST(0, 0, SHR(shifts_6, 5)));
                    shifts_7 := ((CAST(15, 0, xbits_6)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_6, 1)) and (CAST("0000000001111111.")));
                    masks_7 := (SHL(masks_6, 1)) or ((CAST(7, 0, xbits_6)) and (CAST("00000001.")));
                    xbits_7 := (CAST(0, 0, SHR(shifts_7, 4))) xor (CAST(0, 0, SHR(shifts_7, 5)));
                    shifts_8 := ((CAST(15, 0, xbits_7)) and (CAST("0000000000000001.")))
                             or ((SHL(shifts_7, 1)) and (CAST("0000000001111111.")));
                    masks_8 := (SHL(masks_7, 1)) or ((CAST(7, 0, xbits_7)) and (CAST("00000001.")));
                    shiftreg <= shifts_8;
                    mask := masks_8;
                    byte_out <= cassign(bypass_at1 = CAST("1."),
                                        byte_in,
                                        (byte_in) xor (mask));
                    sync_out <= CAST("1.");
                    h3rackreg <= h3rack;
                    h3rreq <= CAST("0.");
                    next_state <= phase2;
                end if;

                if (not (ToBool(h3rackreg_at1))) then
                    h3rreq <= CAST("1.");
                    h3rackreg <= h3rack;
                    next_state <= phase1;
                end if;

            when phase2 =>

                h3rackreg <= h3rack;
                sync_out <= CAST("0.");
                h3rreq <= CAST("1.");
                next_state <= phase1;

            when inireg_derand =>

                seed <= CAST("0000000000000000.");
                shiftreg <= CAST("0000000000000000.");
                bypass <= CAST("0.");
                byte_out <= CAST("00000000.");
                sync_out <= CAST("0.");
                next_state <= rst;

            when others =>
                next_state <= current_state;
        end case;

        if (reset = '1') then
            next_state <= inireg_derand;
            seed <= CAST("0000000000000000.");
            shiftreg <= CAST("0000000000000000.");
            bypass <= CAST("0.");
            h3rackreg <= CAST("0.");
            byte_out <= CAST("00000000.");
            sync_out <= CAST("0.");
        end if;

    end process;

end RTL;



6.16 vhdl/derand_proc_STD.vhd

-- OCAPI - alpha release - generated Thu Jun 11 14:57:23 1998
--
-- includes sfg
-- derandrstphase10
-- derandphase1phase20
-- derandphase1phase11
-- derandphase2phase10
-- derandinireg_derandrst0
--

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
library FXT_PNT_LIB;
use FXT_PNT_LIB.pck_fixed_point.all;

entity derand is
    port (
        clk:       in  std_logic;
        reset:     in  std_logic;
        h3rack:    in  std_logic;
        syncro:    in  std_logic;
        byte_in:   in  std_logic_vector (7 downto 0);
        h3rreq:    out std_logic;
        h3rackreg: out std_logic;
        byte_out:  out std_logic_vector (7 downto 0);
        sync_out:  out std_logic
    );
end derand;

architecture structure of derand is

    component derand_proc
        port (
            clk:           in  std_logic;
            reset:         in  std_logic;
            h3rack:        in  FX (0 downto 0);
            syncro:        in  FX (0 downto 0);
            byte_in:       in  FX (7 downto 0);
            h3rreq:        out FX (0 downto 0);
            h3rackreg_reg: out FX (0 downto 0);
            byte_out_reg:  out FX (7 downto 0);
            sync_out_reg:  out FX (0 downto 0)
        );
    end component;

    signal FX_h3rack    : FX (0 downto 0);
    signal FX_syncro    : FX (0 downto 0);
    signal FX_byte_in   : FX (7 downto 0);
    signal FX_h3rreq    : FX (0 downto 0);
    signal FX_h3rackreg : FX (0 downto 0);
    signal FX_byte_out  : FX (7 downto 0);
    signal FX_sync_out  : FX (0 downto 0);

begin

    FX_h3rack(0) <= h3rack;
    FX_syncro(0) <= syncro;
    FX_byte_in <= FX(SIGNED(byte_in));
    h3rreq <= FX_h3rreq(0);
    h3rackreg <= FX_h3rackreg(0);
    byte_out <= CONV_STD_LOGIC_VECTOR(ToSigned(FX_byte_out), byte_out'length);
    sync_out <= FX_sync_out(0);

    derand : derand_proc
    port map (
        clk           => clk,
        reset         => reset,
        h3rack        => FX_h3rack,
        syncro        => FX_syncro,
        byte_in       => FX_byte_in,
        h3rreq        => FX_h3rreq,
        h3rackreg_reg => FX_h3rackreg,
        byte_out_reg  => FX_byte_out,
        sync_out_reg  => FX_sync_out
    );

end structure;

6.17 vhdl/derand_tb.vhd

-- OCAPI - alpha release - generated Fri Jun 12 16:45:45 1998
--
-- TestBench for design derand

library IEEE;
use IEEE.std_logic_1164.all;

use IEEE.std_logic_textio.all;
use std.textio.all;

library clock;
use clock.clock.all;

entity derand_tb is
end derand_tb;

architecture rtl of derand_tb is

    signal reset     : std_logic;
    signal clk       : std_logic;
    signal h3rack    : std_logic;
    signal byte_in   : std_logic_vector (7 downto 0);
    signal syncro    : std_logic;
    signal h3rreq    : std_logic;
    signal h3rackreg : std_logic;
    signal byte_out  : std_logic_vector (7 downto 0);
    signal sync_out  : std_logic;

    component derand
        port (
            reset:    in  std_logic;
            clk:      in  std_logic;
            h3rack:   in  std_logic;
            byte_in:  in  std_logic_vector (7 downto 0);
            syncro:   in  std_logic;
            h3rreq:   out std_logic;
            byte_out: out std_logic_vector (7 downto 0);
            sync_out: out std_logic
        );
    end component;

begin

    crystal(clk, 50 ns);

    derand_dut : derand
    port map (
        reset    => reset,
        clk      => clk,
        h3rack   => h3rack,
        byte_in  => byte_in,
        syncro   => syncro,
        h3rreq   => h3rreq,
        byte_out => byte_out,
        sync_out => sync_out
    );

    ini : process
    begin
        reset <= '1';
        wait until clk'event and clk = '1';
        reset <= '0';
        wait;
    end process;

    input : process
        file stimuli : text is in "derand_tb.dat";
        variable aline : line;

        file stimulo : text is out "derand_tb.sim_out";
        variable oline : line;

        variable v_h3rack      : std_logic;
        variable v_byte_in     : std_logic_vector (7 downto 0);
        variable v_syncro      : std_logic;
        variable v_h3rreq      : std_logic;
        variable v_byte_out    : std_logic_vector (7 downto 0);
        variable v_sync_out    : std_logic;
        variable v_h3rack_hx   : std_logic;
        variable v_byte_in_hx  : std_logic_vector (7 downto 0);
        variable v_syncro_hx   : std_logic;
        variable v_h3rreq_hx   : std_logic;
        variable v_byte_out_hx : std_logic_vector (7 downto 0);
        variable v_sync_out_hx : std_logic;

    begin
        wait until reset'event and reset = '0';
        loop
            if (not (endfile(stimuli))) then
                readline(stimuli, aline);
                read(aline, v_h3rack);
                read(aline, v_byte_in);
                read(aline, v_syncro);
            else
                assert false
                    report "End of inputfile reached"
                    severity warning;
            end if;

            h3rack <= v_h3rack;
            byte_in <= v_byte_in;
            syncro <= v_syncro;

            wait for 50 ns;

            v_h3rreq := h3rreq;
            v_byte_out := byte_out;
            v_sync_out := sync_out;

            v_h3rack_hx := v_h3rack;
            v_byte_in_hx := v_byte_in;
            v_syncro_hx := v_syncro;
            v_h3rreq_hx := v_h3rreq;
            v_byte_out_hx := v_byte_out;
            v_sync_out_hx := v_sync_out;

            write(oline, v_h3rack_hx);
            write(oline, ' ');
            hwrite(oline, v_byte_in_hx);
            write(oline, ' ');
            write(oline, v_syncro_hx);
            write(oline, ' ');
            write(oline, v_h3rreq_hx);
            write(oline, ' ');
            hwrite(oline, v_byte_out_hx);
            write(oline, ' ');
            write(oline, v_sync_out_hx);
            write(oline, ' ');

            writeline(stimulo, oline);

            wait until clk'event and clk = '1';

        end loop;
    end process;
end rtl;

configuration tbc_rtl of derand_tb is
    for rtl
        for all : derand
            use entity work.derand(structure);
        end for;
    end for;
end tbc_rtl;



